
Conversation


@fpetrini15 fpetrini15 commented Aug 1, 2024

SAs have reported issues with the current state of the tutorial: when trying to launch a Triton server with the models built in this tutorial, they encounter the following error:

tritonserver.InvalidArgumentError: load failed for model 'stable_diffusion_xl': version 1 is at UNAVAILABLE state: Internal: AttributeError: 'tensorrt_bindings.tensorrt.ICudaEngine' object has no attribute 'get_binding_dtype'

Further investigation revealed that the version of TRT being installed in the generated image was 10.2 instead of the intended 9.2. Since there is no exact 9.2.0 release, we pin to the latest version available in the 9.2.x series.

Confirmed both with the SA and through a tutorial walkthrough that this resolves the issue and enables the server to launch and perform inference successfully.

@fpetrini15 fpetrini15 requested a review from rmccorm4 August 1, 2024 17:21

@rmccorm4 rmccorm4 left a comment


Thanks for the thorough description 🚀

@fpetrini15 fpetrini15 merged commit 304e64a into main Aug 1, 2024
@fpetrini15 fpetrini15 deleted the fpetrini-stable-diffusion-trt-fix branch August 1, 2024 17:41
@nv-kmcgill53

This is a brittle change. If the goal of fixing the tutorial is to grab the latest 9.2.x release, then we want to specify the minimum and maximum versions we allow to be installed. I think it would be something like:

`tensorrt>=9.2,<10.0` or `tensorrt>=9.2,<9.3`

That way we can pick up any other releases to the 9.2 branch which may be patched.
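The difference between the two proposed specifiers can be sketched with a quick version-matching check. This is illustrative only, not part of the PR: real pip resolution follows PEP 440, while the helper below handles plain X.Y.Z version strings only.

```python
# Illustrative sketch of the two proposed range specifiers.
# Assumes simple numeric X.Y.Z versions; pre/post/dev suffixes are ignored.
def parse(version):
    """Turn '9.2.5' into (9, 2, 5) for tuple comparison."""
    return tuple(int(part) for part in version.split("."))

def in_range(version, lower, upper):
    """Mimic 'pkg>=lower,<upper' semantics for plain numeric versions."""
    return parse(lower) <= parse(version) < parse(upper)

candidates = ["9.2.0", "9.2.5", "9.3.0", "10.2.0"]

# tensorrt>=9.2,<10.0 : any 9.x release from 9.2 onward, including 9.3
wide = [v for v in candidates if in_range(v, "9.2", "10.0")]

# tensorrt>=9.2,<9.3 : only patch releases on the 9.2 branch
narrow = [v for v in candidates if in_range(v, "9.2", "9.3")]

print(wide)    # ['9.2.0', '9.2.5', '9.3.0']
print(narrow)  # ['9.2.0', '9.2.5']
```

The narrower `>=9.2,<9.3` bound is the one that matches the intent described above: pick up 9.2-branch patches without risking a jump to 9.3 or 10.x.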

@fpetrini15

That's a neat idea, Kyle; I hadn't thought of this!

I'm wondering, though: in the case of a tutorial, is it better to lock it to a specific version that we've actually tested and verified? For example, we now use 9.2.0.post12.dev5 and know it works. If 9.2.0.post12.dev6 came out tomorrow, I couldn't guarantee it works, even if it's extremely likely that it would. I figure, above all else, a tutorial needs to work consistently from start to finish, and locking to a specific version prevents us from having to come back and maintain things.
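For reference, the two strategies side by side as pip requirement lines (the exact pin uses the version quoted in this thread; the range is the earlier proposal):

```text
# exact pin: only the tested-and-verified build is ever installed
tensorrt==9.2.0.post12.dev5

# range pin: any 9.2-branch patch release is acceptable
tensorrt>=9.2,<9.3
```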


nv-kmcgill53 commented Sep 10, 2024

Looking again at this PR, I think I would prefer my >< idea, since I am not sure whether the TRT team is deprecating older builds off our NVIDIA PyPI. If this pinned version is deprecated in the future, then we have the same issue of the tutorial breaking and have to solve the same problem again.

You do make a good point about verifying the tutorial and ensuring we can reproduce errors/functionality, but (and this is strictly IMO) this type of tutorial stability is measured by "how many times do we have to touch it" rather than "do we ensure build consistency." I believe the >< idea does this better, but I don't have any metrics to support it. I am willing to change my opinion if we can prove otherwise.

As a nit that doesn't need to be addressed here: I would prefer this tutorial make use of TRT 10, just because TRT 9 got rolled up into it, and we could potentially have a more consistent tutorial base where users don't have to wonder why some tutorials run on a lower major version than the rest.

@fpetrini15
Copy link
Contributor Author

fpetrini15 commented Sep 10, 2024

Hmm you bring up a good point with the deprecation.

this type of tutorial stability is measured by "how many times do we have to touch it"

We're using the same metric, I think we're just on different sides of the fence 😆. I guess the question is which is more likely:

  • The current pinned version will eventually be deprecated from PyPI.
  • The version installed with the >< strategy will break the tutorial.

I think you're right in that the former is more likely.

Regarding not using TRT 10, I guess I'm less concerned because tutorials generally fall behind the latest and greatest versions of things unless tediously maintained.

I'll throw up a PR soon and see if we can bump to TRT 10: TPRD-461

fdf3d186-88d5 pushed a commit to fdf3d186-88d5/triton-inference-server that referenced this pull request Mar 21, 2025
Pin TensorRT Version in Stable Diffusion Tutorial