Mypyc Trace Logging

Overview

Mypyc can optionally include logging in the generated code that produces a sampled trace of various events or operations that happen during execution. For example, you can use this to find code locations where instances of a specific class are constructed, or identify the most commonly called method of a class.

The trace logs are line-based text files that use : as the field separator. They are are easy to analyze using ad-hoc Python scripts, Unix command-line tools or an SQL database, for example.

Before you start using this, it's recommended to have a look at the mypyc documentation, since some things may not make much sense otherwise.

Collecting a trace log

You will first need to compile the target code using trace logging enabled. When the code is executed, it will write the trace log into mypyc_trace.txt. If you use the mypyc command-line tool to compile your code, define the MYPYC_TRACE_LOG environment variable with value 1:

MYPYC_LOG_TRACE=1 mypyc ...

If you use mypycify(), pass log_trace=True as an extra argument.

If you are analyzing mypy, there is a helper script misc/log_trace_heck.py in the mypy GitHub repository that can compile mypy with trace log enabled and either perform a self check or type check a code fragment (e.g. -c "import some_module").

Trace logging is deterministic, so C compiler flags and hardware details have no impact. Also the target platform and Python version are unlikely to have a significant impact, unless some optimization requires a recent Python version.

Format of the log

Here is an example log output (details may change over time and this may not 100% reflect what you will see):

mypy.semanal.SemanticAnalyzer.flatten_lvalues::primitive_op:list_get_item_unsafe mypy.binder.ConditionalTypeBinder._get::primitive_op:int_gt mypy.semanal.SemanticAnalyzer.is_func_scope:7094:call_c:CPyList_GetItemShort mypy.typetraverser.TypeTraverserVisitor.traverse_type_tuple::primitive_op:var_object_size mypy.typeops.try_contracting_literals_in_union:1080:call_c:PyIter_Next mypy.renaming.LimitedVariableRenameVisitor.visit_name_expr:544:call_c:PySequence_Contains mypy.util.split_module_names:80:call_c:CPyList_GetItemShort mypy.expandtype.ExpandTypeVisitor.visit_instance::primitive_op:int_eq mypy.semanal.SemanticAnalyzer.process_type_annotation:3823:call_c:CPyList_GetItemShort

Let's look at a few log items in detail. First, this is a call to a native method:

mypy.types.Type.__init__:251:call:mypy.nodes.Context.__init__

The event was logged in mypy.types.Type._init__ on line 251 (based on the first two fields). The type of the event is call (a native call), and the called function was mypy.nodes.Context.__init__.

This event is a list get item operation:

mypy.semanal.SemanticAnalyzer.is_func_scope:7094:call_c:CPyList_GetItemShort

The event type call_c indicates a call of a primitive C function or a Python C API function. Python C API functions have the prefix Py while mypyc primitive functions have the prefix CPy.

This event indicates reading the object size of a variable-length object:

mypy.typetraverser.TypeTraverserVisitor.traverse_type_tuple::primitive_op:var_object_size

It's a low-level primitive operation (primitive_op) and usually isn't very interesting, since these operations are few fast.

This is another very fast operation (integer equality):

mypy.binder.ConditionalTypeBinder._get::primitive_op:int_gt

Notable C primitives and their semantics

In this section we will go through some common C primitives or C API functions that are useful to recognize in the trace log.

These are optimized primitives specialized based on the static types of expressions:

CPyList_GetItemShort: Get list item (a[i])
CPyList_GetItem: Get list item (another variant)
PyList_New: Create a list
PyList_Append: Append to a list (a.append(x))
PyList_Check: Equivalent to isinstance(x, list)
PyDict_New: Create a dict
PyDict_Contains: Dict contains (x in d)
CPyDict_GetItem: Get dict item (d[x])
CPyDict_GetWithNone: Get dict item (another variant)
CPyDict_SetItem: Set dict item (d[x] = y)
CPyStr_Equal: String equality
CPyTagged_Add: Add two integers
PySequence_Contains: x in <sequence>
PySequence_Tuple: Equivalent to tuple(x)
PyUnicode_Contains: Substring check
PyUnicode_Concat: Concatenate strings
PySet_New: Create a set
PySet_Add: Add an item to a set
PySet_Contains: Set contains (x in s)

These are slower, generic operations that work for arbitrary objects (including with Any types):

PyObject_GetIter: Equivalent to iter(x)
PyIter_Next: Equivalent to next(it)
PyObject_IsTrue: Is an object considered to be true
PyObject_RichCompare: Compare two objects
PyObject_Vectorcall: Call an object (much slower than optimized native calls)
PyObject_IsInstance: Equivalent to isinstance(x, t)
CPyObject_Hash: Equivalent to hash(x)
CPyObject_Size: Equivalent to len(x)
CPyObject_GetAttr3: Look up an attribute of an object the slow way
CPyObject_GetAttr: Another way to look up an attribute
PyObject_Not: Boolean not operation (not x)
CPyGen_SetStopIterationValue: Raise StopIteration with a value

These are internal operations that don't directly map to Python code:

CPy_NoErrOccurred: Check if an exception was raised by an operation or function call

Analyzing data using Unix command line tools

You can analyze data easily using standard Unix command line tools, including cut, sort, shuf, uniq and wc. We also use ripgrep (rg) to search for substrings and regular expressions, but grep also works.

Example: Commonly called primitives

Display the most commonly called C primitives through the call_c event:

$ rg :call_c: mypyc_trace.txt | cut -d':' -f4 | sort | uniq -c | sort -n ... 4970 PyIter_Next 4987 PyDict_Contains 5074 PyList_Append 7470 CPyStr_Equal 7623 PyList_New 7987 PySet_Contains 8489 CPyList_GetItemShort

We are doing lots of PyIter_Next operations, which are generic and thus somewhat slow. Maybe we can do better? The second example digs deeper into this.

Example: Where does `PyIter_Next` get called?

Display a random sample of 15 events that call PyIter_Next and sort by function:

$ rg PyIter_Next mypyc_trace.txt | shuf | tail -15 | sort mypy.binder.ConditionalTypeBinder.update_from_options:233:call_c:PyIter_Next mypy.binder.ConditionalTypeBinder.update_from_options:233:call_c:PyIter_Next mypy.graph_utils.prepare_sccs:69:call_c:PyIter_Next mypy.indirection.TypeIndirectionVisitor.find_modules:35:call_c:PyIter_Next mypy.indirection.TypeIndirectionVisitor.find_modules:35:call_c:PyIter_Next mypy.semanal_main.semantic_analyze_target:409:call_c:PyIter_Next mypy.solve.solve_constraints:61:call_c:PyIter_Next mypy.solve.solve_one:266:call_c:PyIter_Next mypy.types.CallableType.__init__:1887:call_c:PyIter_Next mypy.types.CallableType.__init__:1887:call_c:PyIter_Next mypy.types.CallableType.__init__:1887:call_c:PyIter_Next mypy.types.flatten_nested_tuples:3752:call_c:PyIter_Next mypy.type_visitor.TypeQuery.query_types:452:call_c:PyIter_Next mypy.type_visitor.TypeTranslator.translate_types:312:call_c:PyIter_Next mypy.type_visitor.TypeTranslator.translate_types:312:call_c:PyIter_Next

Group by function where call happens:

$ rg PyIter_Next mypyc_trace.txt | cut -d':' -f1 | sort | uniq -c | sort -n ... 108 mypy.type_visitor.TypeQuery.query_types 112 mypy.fastparse.ASTConverter.translate_stmt_list 115 mypy.typeops.FreezeTypeVarsVisitor.visit_callable_type 120 mypy.binder.ConditionalTypeBinder.update_from_options 190 mypy.messages.MessageBuilder.incompatible_argument 198 mypy.fastparse.ASTConverter.translate_opt_expr_list 279 mypy.expandtype.ExpandTypeVisitor.expand_types 351 mypy.indirection.TypeIndirectionVisitor.find_modules 396 mypy.type_visitor.TypeTranslator.translate_types 1252 mypy.types.CallableType.__init__

It looks like CallableType.__init__ does a lot of potentially slow generic iteration. On which source lines does this happen?

$ rg PyIter_Next mypyc_trace.txt | rg CallableType.__init__ | cut -d':' -f2 | sort | uniq -c | sort -n 1252 1887

Everything happens on line 1887.

Example: Are there many calls via the global namespace?

Calls to compiled functions and methods are usually compiled to fast C-level calls. However, sometimes mypyc falls back to slow Python calls. You can look for PyObject_Vectorcall, which is used for many such calls. Another option is to look at code that accesses the global namespace dictionary. Access Python namespaces is slow, since dictionary lookup is somewhat expensive, and dictionary values are untyped at runtime.

$ rg :python_call_global: mypyc_trace.txt | cut -d':' -f4 | sort | uniq -c | sort -n 3 defaultdict 5 FunctionContext 6 ErrorMessage 7 pop_on_exit 9 TypeRange 13 PartialTypeScope 44 CurrentType 120 FormalArgument

It looks like we are calling the classes FormalArgument and CurrentType and a few others via the global namespace. This indicates that the classes aren't native classes but either imported from Python libraries or mypyc falls back to compiling them into non-native classes. In the latter case the class may be a named tuple or a dataclass -- mypyc can't apply all optimizations yet for them.

Example: Which functions access attributes using the generic slow path?

The python_get_attr event indicates an access of an attribute using generic Python semantics, which behaves kind of like a dictionary lookup. Attributes of compiled classes can usually be accessed directly, which is many times faster.

$ rg :python_get_attr: mypyc_trace.txt | cut -d':' -f1 | sort | uniq -c | sort -n | tail -20 59 mypy.fastparse.TypeConverter.visit_Subscript 62 mypy.typeanal.TypeAnalyser.analyze_type_with_type_info 63 mypy.subtypes.are_parameters_compatible 71 mypy.fastparse.ASTConverter.visit_Constant 77 mypy.fastparse.ASTConverter.set_block_lines 81 mypy.checker.TypeChecker.visit_if_stmt 101 mypy.checkexpr.ExpressionChecker.check_argument_types 106 mypy.fastparse.ASTConverter.visit_Call 108 mypy.fastparse.ASTConverter.transform_args 110 mypy.semanal.SemanticAnalyzer.analyze_function_body 116 mypy.fastparse.ASTConverter.visit_Attribute 136 mypy.checkmember.analyze_instance_member_access 136 mypy.fastparse.TypeConverter.visit_Name 144 mypy.typeanal.TypeAnalyser.__init__ 158 mypy.fastparse.ASTConverter.visit_Name 177 mypy.checker.TypeChecker.check_func_def 212 mypy.fastparse.TypeConverter.visit 281 mypy.fastparse.ASTConverter.do_func_def 284 mypy.fastparse.ASTConverter.make_argument 904 mypy.fastparse.ASTConverter.set_line

It looks like most of these accesses are happening the the mypy.fastparse module. This makes sense, since the module converts data from Python objects constructed using the ast stdlib module, and these are not native classes, so mypyc can't access the attribute values directly.

Example: Are we accessing globals via the global namespace?

Look at the globals_dict_get_item event to find accesses of values via the module-level global namespace (a dictionary). These can be global variables, names imported from non-compiled modules, or classes that are compiled into regular (slow) Python classes. There are a few things that can help here:

Annotate a global variable as Final (e.g. FOO: Final = 5) if it's a constant value. Accessing final variables is quick.
If you access the value in a loop, cache the value in a local variable outside the loop. Mypyc assumes that the global dictionary can be modified at any time, so automatic optimizations are limited.
If it's a class object, you may be able to change it into a native class instead (this may be tricky, however).

Uh oh!

Mypyc Trace Logging

Overview

Collecting a trace log

Format of the log

Notable C primitives and their semantics

Analyzing data using Unix command line tools

Example: Commonly called primitives

Example: Where does PyIter_Next get called?

Example: Are there many calls via the global namespace?

Example: Which functions access attributes using the generic slow path?

Example: Are we accessing globals via the global namespace?

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally

Example: Where does `PyIter_Next` get called?