What is __pycache__ in Python?

Understanding the __pycache__ folder being created when running Python code

Photo by dimas aditya on Unsplash

Introduction

You may have noticed that when you run Python code, a directory named __pycache__ is (sometimes) created, containing a number of files with the .pyc extension.

In today’s short tutorial we will discuss the purpose of these files, which are created by the Python interpreter. We will cover why they are generated in the first place, how to suppress their creation and how to ensure that they won’t be committed to any remote repository.


.pyc files and the __pycache__ folder

Python is usually described as an interpreted language, but your source code is not executed directly. When you run a program, CPython (the reference implementation of Python) first compiles the source code into bytecode, a set of lower-level instructions that are executed by the Python virtual machine rather than by the CPU itself. Bytecode is an implementation detail of CPython. For modules that you import, the bytecode is also cached in .pyc files so that the next time you run the code Python can skip the compilation step and load those modules faster.
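
To make the compilation step concrete, here is a minimal sketch using the built-in compile() function. The source string and the "<example>" file name are made up for illustration. The code object it produces is the kind of thing CPython serialises into a .pyc file, but this snippet does not touch the cache itself.

```python
# A tiny piece of source code (hypothetical, for illustration only).
source = "total = 1 + 2"

# compile() turns source text into a code object that holds bytecode.
code_obj = compile(source, "<example>", "exec")

# The raw bytecode is a bytes object. Its exact contents differ between
# Python versions, so we only inspect its type here.
bytecode = code_obj.co_code

# Executing the code object runs the bytecode in the given namespace.
namespace = {}
exec(code_obj, namespace)
print(type(bytecode).__name__, namespace["total"])
```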


Note that a couple of concepts discussed in this section around the interpreter and bytecode are simplified, but they are more than enough to help you understand .pyc files and the __pycache__ folder. I am planning to cover these concepts in more detail in a future article.


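Therefore, after the first execution of code that imports your own modules, a __pycache__ folder will be created next to them. It contains one .pyc bytecode file per imported module, named after the .py file plus a tag that identifies the interpreter version (for example my_module.cpython-311.pyc). The script you pass directly to the interpreter is not cached. As mentioned, these files will be used in subsequent executions so that your program starts a little faster.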
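
The naming scheme can be checked with the standard library. The sketch below writes a throwaway module to a temporary directory (the module name my_module and the helper compile_and_locate are hypothetical), compiles it with py_compile, and reports where the .pyc file ended up. It assumes the PYTHONPYCACHEPREFIX option is not set, since that would move the cache elsewhere.

```python
import importlib.util
import os
import py_compile
import tempfile


def compile_and_locate(module_name="my_module"):
    """Compile a throwaway module and describe where its .pyc file was written."""
    with tempfile.TemporaryDirectory() as tmp:
        source_path = os.path.join(tmp, module_name + ".py")
        with open(source_path, "w") as fh:
            fh.write("GREETING = 'hello'\n")

        # Where the import system would look for the cached bytecode.
        expected = importlib.util.cache_from_source(source_path)

        # Compile explicitly; a normal import would do this for us.
        written = py_compile.compile(source_path, doraise=True)

        # Report the path relative to the temporary directory, plus
        # whether it matches the location predicted above.
        return os.path.relpath(written, tmp), expected == written


relative_pyc, matches_expected = compile_and_locate()
print(relative_pyc, matches_expected)
```

The printed path contains the interpreter tag, which is why bytecode from different Python versions can sit side by side in the same __pycache__ folder.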

Every time your source code is modified, it will be recompiled and new bytecode files will be created. By default, Python decides this by comparing the source file’s last-modified time and size with the values recorded in the .pyc file. Note that on rare occasions this check may miss a change (for example when a file’s timestamp is off), and Python will execute the stale cached bytecode, causing you some trouble. For instance, you may have fixed a bug but Python may still be running a buggy cached version. In this case you may have to delete the __pycache__ folder, or even suppress the creation of these files as illustrated in the following section.
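
If you do need to clear the cache, the following is one possible sketch of a cleanup helper. The function name remove_pycache is hypothetical; it simply walks a directory tree and deletes every __pycache__ folder it finds, so point it only at a project you own.

```python
import shutil
from pathlib import Path


def remove_pycache(root="."):
    """Delete every __pycache__ directory under root and return the removed paths."""
    removed = []
    # Build the full list first so we never delete while still iterating.
    for cache_dir in list(Path(root).rglob("__pycache__")):
        if cache_dir.is_dir():
            shutil.rmtree(cache_dir)
            removed.append(cache_dir)
    return removed
```

Calling remove_pycache("path/to/project") removes the cached bytecode for that project only; Python recreates the folders on the next import.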


Suppressing the creation of __pycache__

When using the CPython interpreter (the reference implementation of Python), you can suppress the creation of this folder in two ways.

The first option is to pass the -B flag when running your Python file. When the flag is provided, Python won’t try to write .pyc files on the import of source modules:

python3 -B my_python_app.py

Alternatively, you can set the PYTHONDONTWRITEBYTECODE environment variable to any non-empty string. Again, this will prevent Python from trying to write .pyc files.

export PYTHONDONTWRITEBYTECODE=abc

Note that both approaches are equivalent.
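
Both switches end up setting the same interpreter flag, sys.dont_write_bytecode, which you can inspect from Python. The sketch below starts a child interpreter and reports the value of that flag. The helper name child_disables_bytecode is hypothetical, and the value "abc" is just the example string used above.

```python
import os
import subprocess
import sys


def child_disables_bytecode(args=(), extra_env=None):
    """Start a child interpreter and return its sys.dont_write_bytecode value."""
    env = dict(os.environ)
    # Start from a clean slate so the result only reflects our own settings.
    env.pop("PYTHONDONTWRITEBYTECODE", None)
    env.update(extra_env or {})
    cmd = [sys.executable, *args, "-c", "import sys; print(sys.dont_write_bytecode)"]
    result = subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)
    return result.stdout.strip() == "True"


print("default:", child_disables_bytecode())
print("-B flag:", child_disables_bytecode(args=["-B"]))
print("env var:", child_disables_bytecode(extra_env={"PYTHONDONTWRITEBYTECODE": "abc"}))
```

The same attribute can also be set from within a program (sys.dont_write_bytecode = True), which affects only imports that happen after the assignment.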


Adding __pycache__ to the .gitignore file

When working in a local repository, every file in the working directory is, from Git’s point of view, in one of three states: tracked (already staged or committed), untracked (not staged or committed) or ignored.

In most cases, you should ignore specific files such as those containing sensitive data, system-specific files or auto-generated files created by, say, an IDE or a specific workspace.

The most elegant way to do this is through the .gitignore file, which usually lives in the top directory of the repository. In it you can explicitly list files or directories (glob patterns such as *.pyc can also be used) that Git should ignore. Note that .gitignore only affects untracked files: anything that has already been committed stays tracked until you remove it from the index, for example with git rm -r --cached __pycache__.

__pycache__ is among the directories that shouldn’t be pushed to remote repositories. Therefore, all you need to do is specify the directory in the .gitignore file.

# .gitignore
__pycache__/

Note that for Python projects in general, there are a lot more files that need to go into .gitignore. For a more comprehensive list, refer to this file.


Final Thoughts

In today’s article we discussed cached bytecode in .pyc files and the purpose it serves, along with the __pycache__ directory. Additionally, we explored how to suppress the creation of this directory and how to avoid including it in Git commits (and therefore pushing it to remote repositories by accident).

One important caveat about bytecode is that a .pyc file can be imported even when the equivalent .py file is no longer there. In Python 3 this applies only to a .pyc file that sits directly where the .py file used to be (the legacy layout); .pyc files inside __pycache__ are ignored once their source file is missing. So if you have deleted or renamed a .py file but can still see the old module being imported, a stray .pyc file next to it may be the actual reason.
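
The behaviour can be reproduced with a small experiment. The sketch below is an illustration under stated assumptions rather than a definitive test: the module name ghost_module_demo and both helper functions are hypothetical, and it assumes CPython 3. It compiles a throwaway module, deletes the source file, and then tries to import the module twice: first with only the __pycache__ copy present, then with a .pyc file placed where the .py file used to be.

```python
import importlib
import os
import py_compile
import shutil
import sys
import tempfile

MODULE = "ghost_module_demo"  # hypothetical module name, for illustration only


def try_import():
    """Return True if MODULE can currently be imported."""
    importlib.invalidate_caches()  # make the finders re-read the directory
    sys.modules.pop(MODULE, None)
    try:
        importlib.import_module(MODULE)
        return True
    except ModuleNotFoundError:
        return False


def sourceless_import_demo():
    """Return (import worked with only __pycache__, import worked with a legacy .pyc)."""
    with tempfile.TemporaryDirectory() as project, tempfile.TemporaryDirectory() as stash:
        source = os.path.join(project, MODULE + ".py")
        with open(source, "w") as fh:
            fh.write("VALUE = 42\n")

        # One copy goes to the regular cache location, one is set aside
        # so it can later be placed next to where the source used to be.
        py_compile.compile(source, doraise=True)
        legacy = py_compile.compile(
            source, cfile=os.path.join(stash, MODULE + ".pyc"), doraise=True
        )

        os.remove(source)  # the .py file is now gone
        sys.path.insert(0, project)
        try:
            only_pycache = try_import()
            shutil.move(legacy, os.path.join(project, MODULE + ".pyc"))
            with_legacy_pyc = try_import()
        finally:
            # Undo every change made to the import system.
            sys.path.remove(project)
            sys.modules.pop(MODULE, None)
            sys.path_importer_cache.pop(project, None)
        return only_pycache, with_legacy_pyc


print(sourceless_import_demo())  # prints (False, True) on CPython 3
```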

