Description
Hi,
I’m trying to convert models from PyTorch → ONNX → TensorRT. Ideally, I would like to use INT8 and support dynamic input sizes.
I can create an INT8-calibrated engine if I use builder.build_cuda_engine(network), and I can add optimization profiles for dynamic input support if I use builder.build_engine(network, config) — but I can’t get both at once.
The latter option always seems to ignore the int8_calibrator, regardless of whether I set it on the builder or on the config object, and even if I remove the dynamic-shape optimization profiles entirely (see the code snippet below).
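To make the two paths concrete, here is roughly what I mean (simplified; my_calibrator and profile stand in for the real objects built in the full snippet further down):

# Path A: INT8 calibration works for me here, but no dynamic shapes
builder.int8_mode = True
builder.int8_calibrator = my_calibrator
engine = builder.build_cuda_engine(network)

# Path B: optimization profiles work here, but the calibrator seems ignored
config.int8_calibrator = my_calibrator
config.add_optimization_profile(profile)
engine = builder.build_engine(network, config)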
Please let me know if what I’m trying here is unsupported, or if there is another way to make this work.
Thanks!
Environment
TensorRT Version:
GPU Type: T4
Nvidia Driver Version:
CUDA Version:
CUDNN Version:
Operating System + Version:
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag): nvcr.io/nvidia/pytorch:20.11-py3
Relevant Files
Steps To Reproduce
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(onnx_file_path, input_name, int8_calibrator=None, max_batch_size=1,
                 img_size=None, min_size=None, max_size=None):
    # initialize TensorRT engine and parse ONNX model
    with trt.Builder(TRT_LOGGER) as builder, builder.create_builder_config() as config:
        network_creation_flag = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
        network = builder.create_network(network_creation_flag)
        parser = trt.OnnxParser(network, TRT_LOGGER)

        # parse ONNX
        with open(onnx_file_path, 'rb') as model:
            print('Beginning ONNX file parsing')
            if not parser.parse(model.read()):
                for i in range(parser.num_errors):
                    print(parser.get_error(i))
                return None
        print('Completed parsing of ONNX file')

        # allow TensorRT to use up to 8GB of GPU memory for tactic selection
        config.max_workspace_size = 8 << 30

        # use FP16 mode if possible
        if builder.platform_has_fast_fp16:
            builder.fp16_mode = True
            print('USING FP16!!!')

        if int8_calibrator is not None:
            # tried setting the calibrator on both the builder and the config;
            # neither seems to trigger calibration when build_engine() is used
            builder.int8_mode = True
            config.int8_calibrator = int8_calibrator
            builder.int8_calibrator = int8_calibrator
            print('USING INT8!!!', builder.platform_has_fast_int8)

        # Dynamic input support - commented out for testing
        # (INT8 calibration is still not triggered even without these profiles)
        # if img_size is not None:  # dynamic
        #     opt_min, opt_max = min(img_size), max(img_size)
        #
        #     # landscape profile
        #     profile = builder.create_optimization_profile()
        #     profile.set_shape(input_name,
        #                       min=(1, 3, min_size, opt_max),
        #                       opt=(max_batch_size, 3, opt_min, opt_max),
        #                       max=(max_batch_size, 3, opt_max, opt_max))
        #     config.add_optimization_profile(profile)
        #
        #     # portrait profile
        #     profile = builder.create_optimization_profile()
        #     profile.set_shape(input_name,
        #                       min=(1, 3, opt_max, min_size),
        #                       opt=(max_batch_size, 3, opt_max, opt_min),
        #                       max=(max_batch_size, 3, opt_max, opt_max))
        #     config.add_optimization_profile(profile)

        # generate TensorRT engine optimized for the target platform
        print('Building an engine...')
        # engine = builder.build_cuda_engine(network)   # INT8 calibration works with this path
        engine = builder.build_engine(network, config)  # calibrator appears to be ignored here
        print("Completed creating Engine")
        return engine
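In case it matters, the calibrator I pass in as int8_calibrator follows the standard IInt8EntropyCalibrator2 pattern from the Python samples. This is a minimal sketch, not my exact code — the class name, batch data, and cache file name here are illustrative:

import numpy as np
import pycuda.driver as cuda
import pycuda.autoinit  # creates a CUDA context
import tensorrt as trt

class EntropyCalibrator(trt.IInt8EntropyCalibrator2):
    """Feeds pre-processed batches to TensorRT during INT8 calibration."""

    def __init__(self, batches, cache_file='calibration.cache'):
        trt.IInt8EntropyCalibrator2.__init__(self)
        self.batches = batches      # list of np.float32 arrays, shape (N, 3, H, W)
        self.index = 0
        self.cache_file = cache_file
        # single device buffer, sized for the largest batch
        self.device_input = cuda.mem_alloc(max(b.nbytes for b in batches))

    def get_batch_size(self):
        return self.batches[0].shape[0]

    def get_batch(self, names):
        if self.index >= len(self.batches):
            return None             # None tells TensorRT the calibration data is exhausted
        batch = np.ascontiguousarray(self.batches[self.index])
        cuda.memcpy_htod(self.device_input, batch)
        self.index += 1
        return [int(self.device_input)]

    def read_calibration_cache(self):
        try:
            with open(self.cache_file, 'rb') as f:
                return f.read()
        except FileNotFoundError:
            return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, 'wb') as f:
            f.write(cache)

# illustrative usage (paths and names are placeholders):
# calib = EntropyCalibrator(list_of_float32_nchw_batches)
# engine = build_engine('model.onnx', 'input', int8_calibrator=calib)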