Description
Bug report
Bug description:
Support for inlining list/dict/set comprehensions in c3b595e introduced a new flag, CO_FAST_HIDDEN, which is applied in combination with another kind flag, for example CO_FAST_LOCAL. However, when a code object is copied via code.replace(), this additional flag is lost; consequently, executing the returned code object results in a bizarre-looking error.
Example:
Consider the following example program:
```python
# program.py
import sys

if len(sys.argv) != 2:
    print(f"usage: {sys.argv[0]} <dir|locals|globals>")
    sys.exit(1)

mode = sys.argv[1]

# The comprehension must use same variable name as the code that attempts `del`.
_allvalues = ''.join([myobj for myobj in ['a', 'b', 'c']])
myobj = None  # for del below

if mode == 'dir':
    print("DIR():", dir())
elif mode == 'locals':
    print("LOCALS():", locals())
elif mode == 'globals':
    print("GLOBALS():", globals())
del myobj
```

and the following script that compiles the program to a byte-code .pyc file:
```python
# compile_script.py
import sys
import os
import struct
import marshal
import importlib.util

if len(sys.argv) < 3:
    print(f"usage: {sys.argv[0]} <source> <dest> [0|1]")
    sys.exit(1)

filename = sys.argv[1]
out_filename = sys.argv[2]
strip_co = False if len(sys.argv) < 4 else sys.argv[3] != '0'

with open(filename, 'rb') as fp:
    src = fp.read()

co = compile(src, filename, 'exec')
if strip_co:
    co = co.replace()  # In real use-case, we would be replacing filename here

with open(out_filename, 'wb') as fp:
    fp.write(importlib.util.MAGIC_NUMBER)
    fp.write(struct.pack('<I', 0b01))  # PEP-552: hash-based pyc, check_source=False
    fp.write(b'\00' * 8)  # Zero the source hash
    marshal.dump(co, fp)
```

For some context, the above example is a distilled reproduction of what is going on with PyInstaller and the scipy.stats._distn_infrastructure module in pyinstaller/pyinstaller#7992: the collected module is byte-compiled, and the absolute filename in the code object is anonymized into an environment-relative path via co.replace() (see here for details).
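To illustrate the kind of filename anonymization mentioned above, here is a minimal sketch. It is not PyInstaller's actual code: the helper name replace_filename and the example paths are hypothetical, and the only API it relies on is code.replace() with the co_filename and co_consts keywords.

```python
import types


def replace_filename(co: types.CodeType, new_filename: str) -> types.CodeType:
    """Return a copy of `co`, and of any nested code objects, with co_filename replaced.

    Hypothetical helper for illustration only.
    """
    new_consts = tuple(
        replace_filename(c, new_filename) if isinstance(c, types.CodeType) else c
        for c in co.co_consts
    )
    return co.replace(co_filename=new_filename, co_consts=new_consts)


# Example paths are made up for this sketch.
src = "def f():\n    return [x for x in range(3)]\n"
co = compile(src, "/abs/path/to/module.py", "exec")
anonymized = replace_filename(co, "module.py")
print(anonymized.co_filename)
```

Every code object produced this way has passed through code.replace(), so on an affected interpreter each of them would be subject to the flag loss described in this report.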
In compile_script.py, however, no replacement is actually done, so one would expect co.replace() to return an identical code object.
That is not what happens: the two compiled files differ, even though co == co.replace() evaluates to True in Python:
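A quick in-process way to probe that expectation is to compare the original and the copy both with == and by their marshaled bytes. This is only a sketch: the result of the marshal comparison depends on the interpreter version, and marshal output can also vary for unrelated reasons (for example, object reference counts influence how references are encoded), so a mismatch is a hint rather than proof.

```python
import marshal

# Same shape as program.py: a comprehension variable that shares its name
# with a module-level variable.
src = (
    "_allvalues = ''.join([myobj for myobj in ['a', 'b', 'c']])\n"
    "myobj = None\n"
    "del myobj\n"
)
co = compile(src, "program.py", "exec")
copy = co.replace()  # no fields changed

equal_by_eq = (co == copy)
equal_by_marshal = (marshal.dumps(co) == marshal.dumps(copy))

print("co == copy:", equal_by_eq)
print("marshal bytes identical:", equal_by_marshal)
```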
```
$ python3.12 compile_script.py program.py compiled-orig.pyc 0  # Compile without co.replace()
$ python3.12 compile_script.py program.py compiled-copy.pyc 1  # Compile with co.replace()
```

```
$ sha256sum *.pyc
2e03af03bcbb41b3a6cc6f592f5143acf7d82edc089913504c1f8446764795e1  compiled-copy.pyc
5034955819efba0dc7ff3ee94101c1f6dfe33b102d547efc77577d77a99f1732  compiled-orig.pyc
```

Running the original version:
```
$ python3.12 compiled-orig.pyc globals
GLOBALS(): {'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <_frozen_importlib_external.SourcelessFileLoader object at 0x7fe7fb327830>, '__spec__': None, '__annotations__': {}, '__builtins__': <module 'builtins' (built-in)>, '__file__': '[...]/compiled-orig.pyc', '__cached__': None, 'sys': <module 'sys' (built-in)>, 'mode': 'globals', '_allvalues': 'abc', 'myobj': None}

$ python3.12 compiled-orig.pyc dir
DIR(): ['__annotations__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '_allvalues', 'mode', 'myobj', 'sys']

$ python3.12 compiled-orig.pyc locals
LOCALS(): {'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <_frozen_importlib_external.SourcelessFileLoader object at 0x7f2846527830>, '__spec__': None, '__annotations__': {}, '__builtins__': <module 'builtins' (built-in)>, '__file__': '[...]/compiled-orig.pyc', '__cached__': None, 'sys': <module 'sys' (built-in)>, 'mode': 'locals', '_allvalues': 'abc', 'myobj': None}
```

Running the version with co.replace():
```
$ python3.12 compiled-copy.pyc globals
GLOBALS(): {'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <_frozen_importlib_external.SourcelessFileLoader object at 0x7fd7f1b27830>, '__spec__': None, '__annotations__': {}, '__builtins__': <module 'builtins' (built-in)>, '__file__': '[...]/compiled-copy.pyc', '__cached__': None, 'sys': <module 'sys' (built-in)>, 'mode': 'globals', '_allvalues': 'abc', 'myobj': None}

$ python3.12 compiled-copy.pyc dir
DIR(): ['__annotations__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '_allvalues', 'mode', 'sys']
Traceback (most recent call last):
  File "program.py", line 20, in <module>
    del myobj
        ^^^^^
NameError: name 'myobj' is not defined

$ python3.12 compiled-copy.pyc locals
LOCALS(): {'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <_frozen_importlib_external.SourcelessFileLoader object at 0x7f8a35d27830>, '__spec__': None, '__annotations__': {}, '__builtins__': <module 'builtins' (built-in)>, '__file__': '[...]/compiled-copy.pyc', '__cached__': None, 'sys': <module 'sys' (built-in)>, 'mode': 'locals', '_allvalues': 'abc'}
Traceback (most recent call last):
  File "program.py", line 20, in <module>
    del myobj
        ^^^^^
NameError: name 'myobj' is not defined
```

Comparing compiled-orig.pyc and compiled-copy.pyc in a hex editor shows a single differing byte; its position corresponds to the marshaled co_localspluskinds, and its value is 0x30 (CO_FAST_LOCAL | CO_FAST_HIDDEN) in the original and 0x20 (CO_FAST_LOCAL) in the copy.
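The hex-editor comparison can also be approximated in a few lines of Python. The sketch below is illustrative only: diff_bytes and describe_kind are hypothetical helpers, the two flag values are derived from the bytes quoted above (0x30 and 0x20), and the input is stand-in data rather than the real .pyc contents.

```python
# Flag values derived from the report: 0x30 == CO_FAST_LOCAL | CO_FAST_HIDDEN
# and 0x20 == CO_FAST_LOCAL.
CO_FAST_HIDDEN = 0x10
CO_FAST_LOCAL = 0x20


def diff_bytes(a: bytes, b: bytes) -> list[tuple[int, int, int]]:
    """Return (offset, byte_in_a, byte_in_b) for each position where a and b differ."""
    if len(a) != len(b):
        raise ValueError("inputs differ in length; a positional diff is not meaningful")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


def describe_kind(kind: int) -> str:
    """Name the flags (of the two known to this sketch) that are set in a kind byte."""
    names = []
    if kind & CO_FAST_LOCAL:
        names.append("CO_FAST_LOCAL")
    if kind & CO_FAST_HIDDEN:
        names.append("CO_FAST_HIDDEN")
    return " | ".join(names) or "0"


# Stand-in data: three kind bytes, the middle one losing CO_FAST_HIDDEN in the copy.
orig = bytes([0x20, 0x30, 0x20])
copy = bytes([0x20, 0x20, 0x20])

for offset, x, y in diff_bytes(orig, copy):
    print(f"offset {offset}: {x:#04x} ({describe_kind(x)}) -> {y:#04x} ({describe_kind(y)})")
```

To run it against the actual files, the two byte strings could be read with pathlib.Path("compiled-orig.pyc").read_bytes() and likewise for compiled-copy.pyc.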
CPython versions tested on:
3.12
Operating systems tested on:
Linux, Windows