Description
I haven't properly profiled anything. Throwing this idea here to see what people think.
Consider this fairly common pattern in Python:
```python
for i in range(len(x)): ...
```

Bytecode snippet:

```
6 GET_ITER
>> 8 FOR_ITER    2 (to 14)
```

Anecdotally, some users are surprised at how much overhead this has. In most simple for loops, users intend the range object to behave like the equivalent `for (int i = 0; i < x; i++)` loop in C. The range object is created and then thrown away immediately.
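As a quick sanity check, the `dis` module shows that this pattern really does compile to a `GET_ITER`/`FOR_ITER` pair (exact offsets and surrounding instructions vary by CPython version):

```python
import dis

def f(x):
    for i in range(len(x)):
        pass

# Collect the instruction names; GET_ITER and FOR_ITER should both appear,
# regardless of the exact byte offsets on a given CPython version.
names = [ins.opname for ins in dis.get_instructions(f)]
print("GET_ITER" in names, "FOR_ITER" in names)
```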
In those cases, calling `next(range_iterator)` in FOR_ITER is unnecessary overhead. We can unbox this into a simple PyLong object and use `PyLong_Add` on it, via a new FOR_ITER_RANGE opcode.
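To make the intended semantics concrete, here is a pure-Python sketch of what a FOR_ITER_RANGE opcode would do (the function name and structure are hypothetical; the real thing would live in the interpreter loop in C and add unboxed values with `PyLong_Add` or native longs):

```python
def for_iter_range_sketch(r, body):
    """Hypothetical sketch of FOR_ITER_RANGE: instead of calling
    next() on a range iterator each pass, keep the current value as
    a plain integer and add the step directly."""
    i, stop, step = r.start, r.stop, r.step
    while (step > 0 and i < stop) or (step < 0 and i > stop):
        body(i)
        i += step  # replaces the next(range_iterator) call

out = []
for_iter_range_sketch(range(2, 10, 3), out.append)
print(out)  # should match list(range(2, 10, 3))
```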
This will have to be extremely conservative. We can only apply this to range objects with a reference count of 1 (i.e. used only for the for loop), and they must be the builtin `range`, not some monkeypatched version. Without the compiler's help, the following would be very dangerous to optimize:
```python
x = iter(range(10))
for _ in x:
    next(x)
```

We can also do the same for lists and tuples (FOR_ITER_LIST and FOR_ITER_TUPLE), but use the native `PyTuple_GetItem`/`PyList_GetItem` instead of the iterator protocol. But I'm not sure how common something like `for x in (1, 2, 3)` is, so maybe those aren't worth it (they're also much harder to roll back if the optimization breaks halfway).
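To spell out why the aliased-iterator case is dangerous: the loop body advances the very same iterator that FOR_ITER is driving, so the two interleave and the loop only observes every other value. An unboxed counter that ignored the iterator's real state would silently change this behavior:

```python
# The loop body consumes from the same iterator the loop is driving.
x = iter(range(10))
seen = []
for i in x:
    seen.append(i)
    next(x, None)  # consumes the element the loop would have seen next

print(seen)  # the loop observes only every other value: [0, 2, 4, 6, 8]
```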
FOR_ITER isn't a common instruction, but I think when it's used it matters, because it tends to appear in rather loopy code.
What do y'all think?