Hi, I would like to clarify the TensorRT APIs.
When converting from a TensorFlow model, there can be two outputs:
a UFF file from "uff.from_tensorflow()" and an engine file from "trt.utils.write_engine_to_file()" (see the code below).
Which one is used by the C++ API?
uff_model = uff.from_tensorflow(output_graph, output_names, output_filename='abc.uff')  # UFF file

parser = uffparser.create_uff_parser()
parser.register_input(...)
parser.register_output(...)

engine = trt.utils.uff_to_trt_engine(G_LOGGER, uff_model, parser, 10, 1 << 20)
parser.destroy()

trt.utils.write_engine_to_file("abc.engine", engine.serialize())  # engine file

In "sampleUffMNIST.cpp", it says:
ICudaEngine* engine = loadModelAndCreateEngine("lenet5.uff", maxBatchSize, parser);

Does this mean the C++ API uses UFF files, not engine files?
Is there a C++ API to load an engine file and use it for inference?
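For context, here is a sketch of what I was hoping exists, based on skimming NvInfer.h — I am not sure this is the intended workflow, so please correct me if the calls or their arguments are wrong:

```cpp
// Sketch (my guess, not verified): load a serialized engine file and
// deserialize it back into an ICudaEngine for inference.
#include "NvInfer.h"

#include <fstream>
#include <iostream>
#include <vector>

// Minimal logger, as required by createInferRuntime().
class Logger : public nvinfer1::ILogger
{
    void log(Severity severity, const char* msg) override
    {
        if (severity != Severity::kINFO)
            std::cout << msg << std::endl;
    }
} gLogger;

int main()
{
    // Read the engine file written by trt.utils.write_engine_to_file().
    std::ifstream file("abc.engine", std::ios::binary | std::ios::ate);
    std::streamsize size = file.tellg();
    file.seekg(0, std::ios::beg);
    std::vector<char> buffer(size);
    file.read(buffer.data(), size);

    // Deserialize the raw bytes into an engine.
    nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(gLogger);
    nvinfer1::ICudaEngine* engine =
        runtime->deserializeCudaEngine(buffer.data(), size, nullptr);

    // ... create an execution context and run inference here ...

    engine->destroy();
    runtime->destroy();
    return 0;
}
```

If this is roughly right, it would mean the UFF parsing step from the sample is only needed once, and the serialized engine can be reused directly afterwards.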