Description
Inference time scales linearly with batch size when using a TensorRT engine for Scaled-YOLOv4 object detection.
When I increase the batch size, the inference time increases linearly with it.
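For reference, below is a minimal sketch of how an explicit-batch engine with an optimization profile covering batch sizes 1-4 can be built from an ONNX export of the model. The paths, input tensor name, and input resolution are placeholders rather than my exact setup:

```python
import tensorrt as trt

ONNX_PATH = "scaled_yolov4.onnx"      # placeholder model file
ENGINE_PATH = "scaled_yolov4.engine"  # placeholder output path
INPUT_NAME = "input"                  # placeholder input tensor name
C, H, W = 3, 512, 512                 # placeholder input resolution

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open(ONNX_PATH, "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise SystemExit("ONNX parsing failed")

config = builder.create_builder_config()
config.max_workspace_size = 1 << 30  # 1 GiB

# One optimization profile covering batch sizes 1..4 so TensorRT can pick
# tactics for the larger batches instead of only optimizing for batch 1.
profile = builder.create_optimization_profile()
profile.set_shape(INPUT_NAME, (1, C, H, W), (4, C, H, W), (4, C, H, W))
config.add_optimization_profile(profile)

engine = builder.build_engine(network, config)
with open(ENGINE_PATH, "wb") as f:
    f.write(engine.serialize())
```

Building with a profile whose opt shape matches the batch size actually used lets TensorRT tune for that batch size rather than only for batch 1.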
Environment
TensorRT Version:
Checked on two versions (7.2.2 and 7.0.0)
GPU Type:
Tesla T4
Nvidia Driver Version:
455
CUDA Version:
CUDA 11.1 with TensorRT 7.2.2 and CUDA 10.2 with TensorRT 7.0.0
CUDNN Version:
cuDNN 7 with TensorRT 7.0.0 and cuDNN 8 with TensorRT 7.2.2
Operating System + Version:
ubuntu-18.04
Python Version (if applicable):
3.6.9
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):
nvcr.io/nvidia/tensorrt:20.12-py3 - trt-7.2.2
nvcr.io/nvidia/tensorrt:20.03-py3 - trt-7.0.0
Timing results (ms per inference):

Batch size 1: 48.5283, 48.518, 40.1897, 40.0713, 38.54, 38.7829, 38.6083, 38.6635, 38.1827, 38.1016
Batch size 2: 76.3045, 74.9346, 73.3341, 73.9554, 73.4185, 75.4546, 77.7809, 78.3289, 79.5533, 79.0556, 79.2939, 77.214
Batch size 4: 158.327, 157.001, 157.107, 154.237, 155.899, 157.408, 155.758, 155.906

I expected inference time not to grow in direct proportion to batch size. Can anything be done to improve per-image inference time when batching?
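For reference, the sketch below shows roughly how timings like those above can be reproduced with the Python API. The engine path and input shape are placeholders, and it assumes an explicit-batch engine with a dynamic batch dimension and a single input at binding 0:

```python
import time

import numpy as np
import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
import pycuda.driver as cuda
import tensorrt as trt

ENGINE_PATH = "scaled_yolov4.engine"  # placeholder engine path
INPUT_SHAPE = (3, 512, 512)           # placeholder (C, H, W)

logger = trt.Logger(trt.Logger.WARNING)
with open(ENGINE_PATH, "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()
stream = cuda.Stream()

def time_batch(batch_size, iters=10, warmup=3):
    # Set the dynamic batch dimension for this run (binding 0 = input).
    context.set_binding_shape(0, (batch_size,) + INPUT_SHAPE)

    # Allocate device buffers for all bindings at this batch size.
    dev_bufs, outputs = [], []
    for i in range(engine.num_bindings):
        shape = tuple(context.get_binding_shape(i))
        dtype = trt.nptype(engine.get_binding_dtype(i))
        buf = cuda.mem_alloc(trt.volume(shape) * np.dtype(dtype).itemsize)
        dev_bufs.append(buf)
        if not engine.binding_is_input(i):
            outputs.append((np.empty(shape, dtype=dtype), buf))
    bindings = [int(b) for b in dev_bufs]

    inp = np.random.rand(batch_size, *INPUT_SHAPE).astype(np.float32)

    # Warm up, then time `iters` runs including H2D/D2H copies.
    for i in range(warmup + iters):
        if i == warmup:
            stream.synchronize()
            start = time.perf_counter()
        cuda.memcpy_htod_async(dev_bufs[0], inp, stream)
        context.execute_async_v2(bindings=bindings,
                                 stream_handle=stream.handle)
        for host, dev in outputs:
            cuda.memcpy_dtoh_async(host, dev, stream)
    stream.synchronize()
    per_infer = (time.perf_counter() - start) * 1000.0 / iters
    print(f"batch {batch_size}: {per_infer:.2f} ms/inference, "
          f"{per_infer / batch_size:.2f} ms/image")

for bs in (1, 2, 4):
    time_batch(bs)
```

The warm-up iterations and the stream synchronization before starting the timer are there so that one-off setup and pending async work are not attributed to the measured runs.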
Thanks in advance.