Commit b96dd66

Warning for dataset download & extract race condition
See #8707 and pytorch/pytorch#68320.
1 parent 4249b61 commit b96dd66

1 file changed: +9 -0 lines changed

docs/source/datasets.rst

Lines changed: 9 additions & 0 deletions
@@ -27,6 +27,15 @@ All the datasets have almost similar API. They all have two common arguments:
 ``transform`` and ``target_transform`` to transform the input and target respectively.
 You can also create your own datasets using the provided :ref:`base classes <base_classes_datasets>`.
 
+.. warning::
+
+    When built-in datasets are created at the given root directory for the first time,
+    the main process has to download and extract the files in serial. This step is not
+    safe to execute in parallel and will break when multiple processes are launched for
+    distributed training. As a workaround, launch a single process to create the dataset
+    at the given root directory first, terminate it once it is done downloading and
+    extracting the files, and then launch the distributed training.
+
 Image classification
 ~~~~~~~~~~~~~~~~~~~~
 
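
The workaround described in the added warning can be sketched roughly as below. This is only an illustration, not part of the commit: it assumes the datasets in question are torchvision's, picks ``MNIST`` purely as an example, and the names ``prepare_dataset``, ``build_mnist`` and the ``./data`` root are hypothetical.

```python
from pathlib import Path
from typing import Callable


def prepare_dataset(root: str, build: Callable[[str], object]) -> object:
    """Create a dataset once, from a single process, before any
    distributed workers are launched (hypothetical helper)."""
    # Make sure the root directory exists before the dataset writes into it.
    Path(root).mkdir(parents=True, exist_ok=True)
    # `build` performs the actual download and extraction, in serial.
    return build(root)


def build_mnist(root: str) -> object:
    """Example builder; assumes torchvision is installed."""
    # Imported lazily so prepare_dataset stays usable without torchvision.
    from torchvision.datasets import MNIST

    # download=True fetches and extracts the files under `root`.
    return MNIST(root=root, train=True, download=True)
```

A one-off script would call ``prepare_dataset("./data", build_mnist)`` and exit. Once it has finished, the distributed job can be started separately, for example with ``torchrun --nproc_per_node=4 train.py`` (``train.py`` being a hypothetical training script), where each worker constructs the dataset with the same ``root`` and leaves ``download`` at its default of ``False``.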
