TRT model (max batch size = 2) spends about 2x the inference time of TRT model (max batch size = 1) on convolution and activation layers #1046

@githublsk

Environment:

TensorRT Version: 7.2.1
CUDA Version: 11.1
CUDNN Version: 8.0.4
Operating System + Version: Ubuntu 18.04
Python Version: 3.6.10
PyTorch Version: 1.7.0
## Description
First, we used the project mmdetection-to-tensorrt (https://github.com/grimoire/mmdetection-to-tensorrt) to convert our Faster R-CNN model (a .pth file) directly to a TRT model with max batch size = 1. The convert command is as follows:
(screenshot: convert command)
Then we ran the same command, but with max batch size = 2:
(screenshot: convert command with max batch size = 2)
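Since the command screenshots are not reproduced here, the following is a minimal sketch of what the two conversions look like with the project's Python API, based on its README; `cfg_path`, `weight_path`, and `save_path` are placeholders, and parameter names may differ between versions. The only difference between the two engines is the batch dimension of the maximum input shape:

```python
import torch
from mmdet2trt import mmdet2trt

# Shape ranges for the dynamic input: [min, optimal, max].
# Setting the batch dimension of the max shape to 2 is what enables
# max batch size = 2; use 1 everywhere for the other engine.
opt_shape_param = [
    [
        [1, 3, 800, 1344],  # min shape
        [1, 3, 800, 1344],  # optimal shape
        [2, 3, 800, 1344],  # max shape (batch dim = 2)
    ]
]

trt_model = mmdet2trt(
    cfg_path,               # mmdetection config file (placeholder)
    weight_path,            # Faster R-CNN .pth checkpoint (placeholder)
    opt_shape_param=opt_shape_param,
    fp16_mode=False,
)
torch.save(trt_model.state_dict(), save_path)  # placeholder output path
```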
Third, we used the two converted TRT models to run inference on the same image (for the max-batch-size-2 model, the image was duplicated to fill the batch). We added the code below to record per-layer time, then summarised the top-20 most time-consuming layers for the two models in the table below. For the layers highlighted in green, the time consumed by the model with max batch size = 2 is almost double that of the model with max batch size = 1. That seems unreasonable, because the highlighted layers are convolution layers, whose tensor operations should be able to run in parallel across the batch. Can you give me some suggestions? Thank you.
(screenshot: profiler code)
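The profiler code itself is only in the screenshot; below is a minimal sketch of how per-layer times can be recorded with TensorRT's Python `IProfiler` interface. Here `context` is assumed to be an existing `trt.IExecutionContext`, and note that attaching a profiler forces synchronous execution:

```python
import tensorrt as trt

class LayerTimeProfiler(trt.IProfiler):
    """Accumulate per-layer execution times reported by TensorRT."""

    def __init__(self):
        super().__init__()
        self.layer_times = {}  # layer name -> accumulated time (ms)

    def report_layer_time(self, layer_name, ms):
        # Called by TensorRT once per layer for each profiled execution.
        self.layer_times[layer_name] = self.layer_times.get(layer_name, 0.0) + ms

profiler = LayerTimeProfiler()
context.profiler = profiler      # `context` is an existing execution context
# ... run inference, e.g. context.execute_v2(bindings) ...

# Print the top-20 most expensive layers.
for name, ms in sorted(profiler.layer_times.items(),
                       key=lambda kv: kv[1], reverse=True)[:20]:
    print(f"{ms:8.3f} ms  {name}")
```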
(screenshot: top-20 layer timing table)
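To make the comparison in the table concrete: if `times_bs1` and `times_bs2` are the per-layer timings collected by the profiler above for the two engines (hypothetical variable names), the observation is that the per-layer ratio is close to 2:

```python
# Compare per-layer times of the two engines (hypothetical dicts
# produced by LayerTimeProfiler for max batch size 1 and 2).
for name, t1 in times_bs1.items():
    t2 = times_bs2.get(name)
    if t2 is not None and t1 > 0:
        # A ratio near 2.0 means the layer's time grows linearly with
        # batch size, i.e. batching added no extra parallelism here.
        print(f"{t2 / t1:4.2f}x  {name}")
```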

Labels: Module:Performance, triaged
