Installation • Rules • Contributing • License
MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvements in both training algorithms and models. This repository holds the competition rules and the benchmark code to run it.
- Create a new environment, e.g. via `conda` or `virtualenv` (Python minimum requirement >= 3.7):

  ```bash
  sudo apt-get install python3-venv
  python3 -m venv env
  source env/bin/activate
  ```
- Clone this repository:

  ```bash
  git clone https://github.com/mlcommons/algorithmic-efficiency.git
  cd algorithmic-efficiency
  ```
- We use pip to install the `algorithmic_efficiency` package.

  *TL;DR to install the JAX version for GPU run:*

  ```bash
  pip3 install -e '.[pytorch_cpu]'
  pip3 install -e '.[jax_gpu]' -f 'https://storage.googleapis.com/jax-releases/jax_cuda_releases.html'
  pip3 install -e '.[full]'
  ```

  *TL;DR to install the PyTorch version for GPU run:*

  ```bash
  pip3 install -e '.[jax_cpu]'
  pip3 install -e '.[pytorch_gpu]' -f 'https://download.pytorch.org/whl/torch_stable.html'
  pip3 install -e '.[full]'
  ```

  You can also install the requirements for individual workloads, e.g. via

  ```bash
  pip3 install -e '.[librispeech]'
  ```

  or all workloads at once via

  ```bash
  pip3 install -e '.[full]'
  ```

  Depending on the framework you want to use (e.g. JAX or PyTorch), you need to install it as well. You can either do this manually or by adding the corresponding options:
  JAX (GPU):

  ```bash
  pip3 install -e '.[jax_gpu]' -f 'https://storage.googleapis.com/jax-releases/jax_cuda_releases.html'
  ```

  JAX (CPU):

  ```bash
  pip3 install -e '.[jax_cpu]'
  ```

  PyTorch (GPU):

  ```bash
  pip3 install -e '.[pytorch_gpu]' -f 'https://download.pytorch.org/whl/torch_stable.html'
  ```

  PyTorch (CPU):

  ```bash
  pip3 install -e '.[pytorch_cpu]'
  ```

Development
To use the development tools such as `pytest` or `pylint`, use the `dev` option:

```bash
pip3 install -e '.[dev]'
```

To get an installation with the requirements for all workloads and development, use the argument `[full_dev]`.
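Following the same pattern as the other extras above, that combined installation should be:

```bash
pip3 install -e '.[full_dev]'
```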
Docker is the easiest way to enable PyTorch/JAX GPU support on Linux since only the NVIDIA® GPU driver is required on the host machine (the NVIDIA® CUDA® Toolkit does not need to be installed).
- Install Docker on your local host machine.

- For GPU support on Linux, install NVIDIA Docker support.

  - Take note of your Docker version with `docker -v`. Versions earlier than 19.03 require `nvidia-docker2` and the `--runtime=nvidia` flag. On versions 19.03 and later, you will use the `nvidia-container-toolkit` package and the `--gpus all` flag. Both options are documented on the page linked above.
- Clone this repository:

  ```bash
  git clone https://github.com/mlcommons/algorithmic-efficiency.git
  ```
- Build the Docker image:

  ```bash
  cd algorithmic-efficiency/ && sudo docker build -t algorithmic-efficiency .
  ```
- Run the Docker container:

  ```bash
  sudo docker run --gpus all -it --rm -v $PWD:/home/ubuntu/algorithmic-efficiency --ipc=host algorithmic-efficiency
  ```

  Currently, the Docker method installs both PyTorch and JAX.
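As noted in the Docker version step above, versions earlier than 19.03 use the `--runtime=nvidia` flag instead of `--gpus all`. A minimal sketch of the equivalent run command under that assumption:

```bash
# Only for Docker versions earlier than 19.03 (nvidia-docker2); newer versions use --gpus all as above.
sudo docker run --runtime=nvidia -it --rm -v $PWD:/home/ubuntu/algorithmic-efficiency --ipc=host algorithmic-efficiency
```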
To run a submission on a workload, use the `submission_runner.py` script. For example, to run the MNIST workload with JAX:

```bash
python3 submission_runner.py \
    --framework=jax \
    --workload=mnist \
    --submission_path=reference_submissions/mnist/mnist_jax/submission.py \
    --tuning_search_space=reference_submissions/mnist/tuning_search_space.json
```

To run the MNIST workload with PyTorch:

```bash
python3 submission_runner.py \
    --framework=pytorch \
    --workload=mnist \
    --submission_path=reference_submissions/mnist/mnist_pytorch/submission.py \
    --tuning_search_space=reference_submissions/mnist/tuning_search_space.json
```

When using multiple GPUs on a single node, it is recommended to use PyTorch's distributed data parallel. To do so, simply replace `python3` with
```bash
torchrun --standalone --nnodes=1 --nproc_per_node=N_GPUS
```

where `N_GPUS` is the number of available GPUs on the node.
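For example, applying that substitution to the PyTorch MNIST command above on a node with (hypothetically) 8 GPUs would give:

```bash
# N_GPUS=8 is only an example value; set it to the number of GPUs on your node.
torchrun --standalone --nnodes=1 --nproc_per_node=8 submission_runner.py \
    --framework=pytorch \
    --workload=mnist \
    --submission_path=reference_submissions/mnist/mnist_pytorch/submission.py \
    --tuning_search_space=reference_submissions/mnist/tuning_search_space.json
```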
The rules for the MLCommons Algorithmic Efficiency benchmark can be found in the separate rules document. Suggestions, clarifications, and questions can be raised via pull requests.
If you are interested in contributing to the work of the working group, feel free to join the weekly meetings, open issues, and see the MLCommons contributing guidelines.
We run basic presubmit checks with GitHub Actions, configured in the .github/workflows folder.
To run the below commands, use the versions installed via `pip install -e '.[dev]'`.
To automatically fix formatting errors, run the following (WARNING: this will edit your code, so it is suggested to make a git commit first!):
```bash
yapf -i -r -vv -p algorithmic_efficiency baselines target_setting_runs reference_submissions tests *.py
```

To print out all offending import orderings, run the following (you will need to manually make the edits, because reordering Python imports can cause side-effects):
```bash
isort . --check --diff
```

To print out all offending pylint issues, run the following:
```bash
pylint algorithmic_efficiency
pylint baselines
pylint target_setting_runs
pylint reference_submissions
pylint submission_runner.py
pylint tests
```
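The `dev` extra also installs `pytest` (see the Development notes above). Assuming the test suite lives in the `tests` directory referenced by the commands above, a minimal sketch for running it locally:

```bash
# Runs the test suite in the tests/ directory (assumed layout).
pytest tests
```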