This PR is to support the use case where the 0th dimension of `input` can be dynamic. I fixed a couple of things in this PR:

1. Currently we remove the `symints` from the `args` by filtering on the tensor type. This is actually problematic, because the FX graph expects the `symints` to be part of the input, so it will likely hit edge cases where the order/number of args differs from what FX expects.
2. We currently cache each graph by all input shapes. I took a profile and this can be expensive (~1 ms for a 16-layer decoder). Instead, PyTorch actually passes us the dynamic dimensions as inputs, so we can cache on those instead (see the cache-key sketch after this list). If there is no dynamic dimension we cache on an empty tuple, which is cheap.
3. With `dynamic=False` and `mark_dynamic` we can no longer easily tell whether the current graph will be dynamic or not. Since, with 2 above, the caching is very cheap, I do the `graph_var` lookup for every graph instead of only dynamic graphs (a `mark_dynamic` usage example is shown below).
4. We cannot run the partitioner when there are `symints` as inputs, because the partitioner will put the `symint` and its return into a separate graph, which messes up the dynamic-dimension lookup. For now I just skip the partitioner when there are `symint` inputs (see the check sketched at the end).
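For points 1 and 2, here is a minimal sketch (assumed names, not the actual code in this PR) of keeping the `symints` in `args` and building the graph cache key from only the dynamic-dimension inputs, falling back to an empty tuple when nothing is dynamic:

```python
from typing import Any, Dict, Tuple

import torch

# Hypothetical cache: one compiled graph per concrete value of the dynamic
# dimensions, instead of one entry per full set of input shapes.
_graph_cache: Dict[Tuple[int, ...], Any] = {}


def graph_cache_key(args) -> Tuple[int, ...]:
    """Key on the SymInt/int inputs only.

    PyTorch passes the dynamic dimensions to the backend as extra scalar
    inputs, so hashing just those selects the right specialization; with no
    dynamic dimension this is an empty tuple, which is cheap to hash.
    """
    return tuple(int(a) for a in args if isinstance(a, (int, torch.SymInt)))


def run_cached(compile_fn, args):
    # The symints are intentionally kept in `args`: the FX graph expects them
    # as inputs, so filtering them out can change the order/number of args.
    key = graph_cache_key(args)
    if key not in _graph_cache:
        _graph_cache[key] = compile_fn(args)
    return _graph_cache[key](*args)
```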
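For point 3, this is the user-facing pattern that makes a graph dynamic even though `dynamic=False` is set, which is why the backend can no longer tell up front whether a given graph is dynamic (the module and shapes here are just placeholders):

```python
import torch
import torch._dynamo

model = torch.nn.Linear(16, 16)
compiled = torch.compile(model, dynamic=False)  # no automatic dynamic shapes

x = torch.randn(8, 16)
# Explicitly mark dim 0 (the batch dimension) as dynamic, so the backend
# receives a SymInt for it instead of recompiling for every batch size.
torch._dynamo.mark_dynamic(x, 0)

out = compiled(x)
```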
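For point 4, a rough sketch of the skip: check whether any of the example inputs handed to the backend is a `SymInt`, and if so bypass the partitioner (`compile_whole_graph` and `partition_and_compile` are hypothetical helpers standing in for the real ones):

```python
import torch
from torch.fx import GraphModule


def has_symint_inputs(example_inputs) -> bool:
    """True if any input handed to the backend is a SymInt (a dynamic dim)."""
    return any(isinstance(x, torch.SymInt) for x in example_inputs)


def compile_graph(gm: GraphModule, example_inputs):
    if has_symint_inputs(example_inputs):
        # The partitioner would move the SymInt and its return into a separate
        # subgraph, breaking the dynamic-dimension lookup, so skip it for now.
        return compile_whole_graph(gm, example_inputs)   # hypothetical helper
    return partition_and_compile(gm, example_inputs)     # hypothetical helper
```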