@@ -30,7 +30,7 @@ The following table provides links to where you can find additional information
 ## Validation accuracy results

 Our results were obtained by running the applicable
-training scripts in the [ framework-container-name ] NGC container
+training scripts in the 20.12 PyTorch NGC container
 on NVIDIA DGX-1 with (8x V100 16GB) GPUs.
 The specific training script that was run is documented
 in the corresponding model's README.
@@ -56,49 +56,48 @@ three classification models side-by-side.


 Our results were obtained by running the applicable
-training scripts in the pytorch-20.12 NGC container
+training scripts in the 21.03 PyTorch NGC container
 on NVIDIA DGX A100 with (8x A100 80GB) GPUs.
 Performance numbers (in images per second)
 were averaged over an entire training epoch.
 The specific training script that was run is documented
 in the corresponding model's README.

-The following table shows the training accuracy results of the
-three classification models side-by-side.
-
+The following table shows the training accuracy results of
+all the classification models side-by-side.

 | **Model** | **Mixed Precision** | **TF32** | **Mixed Precision Speedup** |
 | :----------------------:| :-------------------:| :----------:| :---------------------------:|
-| efficientnet-b0 | 14391 img/s | 8225 img/s | 1.74 x |
-| efficientnet-b4 | 2341 img/s | 1204 img/s | 1.94 x |
-| efficientnet-widese-b0 | 15053 img/s | 8233 img/s | 1.82 x |
-| efficientnet-widese-b4 | 2339 img/s | 1202 img/s | 1.94 x |
-| resnet50 | 15977 img/s | 7365 img/s | 2.16 x |
-| resnext101-32x4d | 7399 img/s | 3193 img/s | 2.31 x |
-| se-resnext101-32x4d | 5248 img/s | 2665 img/s | 1.96 x |
+| efficientnet-b0 | 16652 img/s | 8193 img/s | 2.03 x |
+| efficientnet-b4 | 2570 img/s | 1223 img/s | 2.1 x |
+| efficientnet-widese-b0 | 16368 img/s | 8244 img/s | 1.98 x |
+| efficientnet-widese-b4 | 2585 img/s | 1223 img/s | 2.11 x |
+| resnet50 | 16621 img/s | 7248 img/s | 2.29 x |
+| resnext101-32x4d | 7925 img/s | 3471 img/s | 2.28 x |
+| se-resnext101-32x4d | 5779 img/s | 2991 img/s | 1.93 x |

 ### Training performance: NVIDIA DGX-1 16G (8x V100 16GB)

 Our results were obtained by running the applicable
-training scripts in the pytorch-20.12 NGC container
+training scripts in the 21.03 PyTorch NGC container
 on NVIDIA DGX-1 with (8x V100 16GB) GPUs.
 Performance numbers (in images per second)
 were averaged over an entire training epoch.
 The specific training script that was run is documented
 in the corresponding model's README.

-The following table shows the training accuracy results of the
-three classification models side-by-side.
+The following table shows the training accuracy results of all the
+classification models side-by-side.

 | **Model** | **Mixed Precision** | **FP32** | **Mixed Precision Speedup** |
 | :----------------------:| :-------------------:| :----------:| :---------------------------:|
-| efficientnet-b0 | 7664 img/s | 4571 img/s | 1.67 x |
-| efficientnet-b4 | 1330 img/s | 598 img/s | 2.22 x |
-| efficientnet-widese-b0 | 7694 img/s | 4489 img/s | 1.71 x |
-| efficientnet-widese-b4 | 1323 img/s | 590 img/s | 2.24 x |
-| resnet50 | 7608 img/s | 2851 img/s | 2.66 x |
-| resnext101-32x4d | 3742 img/s | 1117 img/s | 3.34 x |
-| se-resnext101-32x4d | 2716 img/s | 994 img/s | 2.73 x |
+| efficientnet-b0 | 7789 img/s | 4672 img/s | 1.66 x |
+| efficientnet-b4 | 1366 img/s | 616 img/s | 2.21 x |
+| efficientnet-widese-b0 | 7875 img/s | 4592 img/s | 1.71 x |
+| efficientnet-widese-b4 | 1356 img/s | 612 img/s | 2.21 x |
+| resnet50 | 8322 img/s | 2855 img/s | 2.91 x |
+| resnext101-32x4d | 4065 img/s | 1133 img/s | 3.58 x |
+| se-resnext101-32x4d | 2971 img/s | 1004 img/s | 2.95 x |

 ## Model Comparison

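Review note: the "Mixed Precision Speedup" column in the added rows can be sanity-checked against the two throughput columns. The sketch below is a minimal check, not the method used to produce the README. It assumes the speedup is the mixed-precision throughput divided by the TF32 throughput, and that the result is truncated (not rounded) to two decimals. The truncation is an inference from the published numbers; the README does not state how the column was computed, and the published values may instead come from unrounded throughputs. The helper name `speedup` and the dictionary `a100_rows` are hypothetical.

```python
import math


def speedup(mixed_img_s: float, baseline_img_s: float) -> float:
    """Return mixed/baseline throughput, truncated to two decimals.

    Truncation is an assumption inferred from the table values.
    """
    ratio = mixed_img_s / baseline_img_s
    return math.floor(ratio * 100) / 100


# (mixed precision img/s, TF32 img/s, published speedup) copied from the
# added DGX A100 rows of the diff above.
a100_rows = {
    "efficientnet-b0": (16652, 8193, 2.03),
    "efficientnet-b4": (2570, 1223, 2.1),
    "efficientnet-widese-b0": (16368, 8244, 1.98),
    "efficientnet-widese-b4": (2585, 1223, 2.11),
    "resnet50": (16621, 7248, 2.29),
    "resnext101-32x4d": (7925, 3471, 2.28),
    "se-resnext101-32x4d": (5779, 2991, 1.93),
}

for model, (mixed, tf32, published) in a100_rows.items():
    computed = speedup(mixed, tf32)
    status = "ok" if computed == published else "MISMATCH"
    print(f"{model}: computed {computed:.2f} x, published {published} x [{status}]")
```

The same helper can be applied to the DGX-1 rows by passing the FP32 throughput as the baseline.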