Commit bd257e1

Merge pull request NVIDIA#920 from NVIDIA/gh/release
Gh/release
2 parents 5b9787c + 169b081 commit bd257e1

173 files changed: +124560 -1540 lines changed

PyTorch/Classification/ConvNets/README.md

Lines changed: 21 additions & 22 deletions
@@ -30,7 +30,7 @@ The following table provides links to where you can find additional information
 ## Validation accuracy results
 
 Our results were obtained by running the applicable
-training scripts in the [framework-container-name] NGC container
+training scripts in the 20.12 PyTorch NGC container
 on NVIDIA DGX-1 with (8x V100 16GB) GPUs.
 The specific training script that was run is documented
 in the corresponding model's README.
@@ -56,49 +56,48 @@ three classification models side-by-side.
 
 
 Our results were obtained by running the applicable
-training scripts in the pytorch-20.12 NGC container
+training scripts in the 21.03 PyTorch NGC container
 on NVIDIA DGX A100 with (8x A100 80GB) GPUs.
 Performance numbers (in images per second)
 were averaged over an entire training epoch.
 The specific training script that was run is documented
 in the corresponding model's README.
 
-The following table shows the training accuracy results of the
-three classification models side-by-side.
-
+The following table shows the training accuracy results of
+all the classification models side-by-side.
 
 | **Model** | **Mixed Precision** | **TF32** | **Mixed Precision Speedup** |
 |:----------------------:|:-------------------:|:----------:|:---------------------------:|
-| efficientnet-b0 | 14391 img/s | 8225 img/s | 1.74 x |
-| efficientnet-b4 | 2341 img/s | 1204 img/s | 1.94 x |
-| efficientnet-widese-b0 | 15053 img/s | 8233 img/s | 1.82 x |
-| efficientnet-widese-b4 | 2339 img/s | 1202 img/s | 1.94 x |
-| resnet50 | 15977 img/s | 7365 img/s | 2.16 x |
-| resnext101-32x4d | 7399 img/s | 3193 img/s | 2.31 x |
-| se-resnext101-32x4d | 5248 img/s | 2665 img/s | 1.96 x |
+| efficientnet-b0 | 16652 img/s | 8193 img/s | 2.03 x |
+| efficientnet-b4 | 2570 img/s | 1223 img/s | 2.1 x |
+| efficientnet-widese-b0 | 16368 img/s | 8244 img/s | 1.98 x |
+| efficientnet-widese-b4 | 2585 img/s | 1223 img/s | 2.11 x |
+| resnet50 | 16621 img/s | 7248 img/s | 2.29 x |
+| resnext101-32x4d | 7925 img/s | 3471 img/s | 2.28 x |
+| se-resnext101-32x4d | 5779 img/s | 2991 img/s | 1.93 x |
 
 ### Training performance: NVIDIA DGX-1 16G (8x V100 16GB)
 
 Our results were obtained by running the applicable
-training scripts in the pytorch-20.12 NGC container
+training scripts in the 21.03 PyTorch NGC container
 on NVIDIA DGX-1 with (8x V100 16GB) GPUs.
 Performance numbers (in images per second)
 were averaged over an entire training epoch.
 The specific training script that was run is documented
 in the corresponding model's README.
 
-The following table shows the training accuracy results of the
-three classification models side-by-side.
+The following table shows the training accuracy results of all the
+classification models side-by-side.
 
 | **Model** | **Mixed Precision** | **FP32** | **Mixed Precision Speedup** |
 |:----------------------:|:-------------------:|:----------:|:---------------------------:|
-| efficientnet-b0 | 7664 img/s | 4571 img/s | 1.67 x |
-| efficientnet-b4 | 1330 img/s | 598 img/s | 2.22 x |
-| efficientnet-widese-b0 | 7694 img/s | 4489 img/s | 1.71 x |
-| efficientnet-widese-b4 | 1323 img/s | 590 img/s | 2.24 x |
-| resnet50 | 7608 img/s | 2851 img/s | 2.66 x |
-| resnext101-32x4d | 3742 img/s | 1117 img/s | 3.34 x |
-| se-resnext101-32x4d | 2716 img/s | 994 img/s | 2.73 x |
+| efficientnet-b0 | 7789 img/s | 4672 img/s | 1.66 x |
+| efficientnet-b4 | 1366 img/s | 616 img/s | 2.21 x |
+| efficientnet-widese-b0 | 7875 img/s | 4592 img/s | 1.71 x |
+| efficientnet-widese-b4 | 1356 img/s | 612 img/s | 2.21 x |
+| resnet50 | 8322 img/s | 2855 img/s | 2.91 x |
+| resnext101-32x4d | 4065 img/s | 1133 img/s | 3.58 x |
+| se-resnext101-32x4d | 2971 img/s | 1004 img/s | 2.95 x |
 
 ## Model Comparison

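The **Mixed Precision Speedup** column in the tables above can be reproduced from the two throughput columns. The sketch below is not part of the commit. It assumes the speedup is the mixed-precision throughput divided by the FP32 throughput, truncated (not rounded) to two decimals, which is what the updated DGX-1 rows are consistent with (for example, 7789 / 4672 ≈ 1.667 is listed as 1.66). The names `DGX1_V100` and `speedup` are hypothetical and exist only for this illustration.

```python
import math

# Throughput in img/s as (mixed precision, FP32), copied from the
# updated (+) rows of the DGX-1 (8x V100 16GB) table in this diff.
DGX1_V100 = {
    "efficientnet-b0": (7789, 4672),
    "efficientnet-b4": (1366, 616),
    "efficientnet-widese-b0": (7875, 4592),
    "efficientnet-widese-b4": (1356, 612),
    "resnet50": (8322, 2855),
    "resnext101-32x4d": (4065, 1133),
    "se-resnext101-32x4d": (2971, 1004),
}


def speedup(mixed: float, baseline: float) -> float:
    """Return mixed / baseline, truncated to two decimals.

    Truncation is an assumption inferred from the table values;
    the commit does not state how the column was computed.
    """
    return math.floor(mixed / baseline * 100) / 100


if __name__ == "__main__":
    for model, (mixed, fp32) in DGX1_V100.items():
        print(f"{model:<24} {speedup(mixed, fp32):.2f} x")
```

Rounding to nearest instead of truncating would give 1.67 for efficientnet-b0 and 3.59 for resnext101-32x4d, which does not match the listed values, so the truncation assumption matters when cross-checking the table.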