@DylanChen-NV
Collaborator

@DylanChen-NV DylanChen-NV commented Jun 30, 2025

[TRTLLM-5812][feat] support FP8 row-wise dense GEMM in torch flow

Description

Adapt the existing FP8 row-wise dense GEMM kernels (sm89/sm90/sm120) to the torch workflow.
Support FP8 row-wise GEMM as a TunableRunner for better performance.
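
For readers unfamiliar with the term, here is a minimal pure-Python sketch of what a row-wise scaled FP8 GEMM computes, assuming one scale per activation row and one scale per weight row (output channel). It is not the kernel or torch op added in this PR: the function names (`round_to_e4m3`, `quantize_rowwise`, `fp8_rowwise_gemm_reference`) are hypothetical, and the E4M3 rounding is a simplified emulation.

```python
import math

FP8_E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3


def round_to_e4m3(x):
    """Round x to a nearby E4M3-representable value (simplified emulation)."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), FP8_E4M3_MAX)  # saturate instead of overflowing
    _, exp = math.frexp(mag)         # mag = m * 2**exp with 0.5 <= m < 1
    exp = max(exp, -5)               # below 2**-6 the spacing is fixed (subnormals)
    step = 2.0 ** (exp - 4)          # 4 significant bits: 1 implicit + 3 stored
    return sign * min(round(mag / step) * step, FP8_E4M3_MAX)


def quantize_rowwise(matrix):
    """Quantize each row with its own scale so the row's max maps to FP8_E4M3_MAX."""
    quantized, scales = [], []
    for row in matrix:
        amax = max(abs(v) for v in row)
        scale = amax / FP8_E4M3_MAX if amax > 0 else 1.0
        quantized.append([round_to_e4m3(v / scale) for v in row])
        scales.append(scale)
    return quantized, scales


def fp8_rowwise_gemm_reference(a_q, a_scales, b_q, b_scales):
    """Compute a_q @ b_q^T and rescale entry (m, n) by a_scales[m] * b_scales[n]."""
    return [
        [sum(x * y for x, y in zip(a_row, b_row)) * sa * sb
         for b_row, sb in zip(b_q, b_scales)]
        for a_row, sa in zip(a_q, a_scales)
    ]


if __name__ == "__main__":
    activations = [[1.0, -2.0, 0.5], [100.0, 50.0, -25.0]]  # shape [M, K]
    weights = [[0.25, 0.5, 1.0], [3.0, -1.0, 2.0]]          # shape [N, K]
    a_q, a_s = quantize_rowwise(activations)
    b_q, b_s = quantize_rowwise(weights)
    print(fp8_rowwise_gemm_reference(a_q, a_s, b_q, b_s))   # shape [M, N]
```

Because every row carries its own scale, a row with small values is not forced to share a dynamic range with a row that has large values, which is the usual motivation for row-wise over per-tensor scaling.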

Test Coverage

tests/unittest/_torch/thop/test_fp8_rowwise_linear.py: Tests the output correctness of the row-wise torch op.

GitHub Bot Help

/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

Provide a user-friendly way for developers to interact with a Jenkins server.

Run /bot [-h|--help] to print this help message.

See details below for each supported subcommand.

Details

run [--disable-fail-fast --skip-test --stage-list "A10-1, xxx" --gpu-type "A30, H100_PCIe" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-[Post-Merge]-1, xxx"]

Launch build/test pipelines. All previously running jobs will be killed.

--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.

--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.

--stage-list "A10-1, xxx" (OPTIONAL) : Only run the specified test stages. Examples: "A10-1, xxx". Note: Does NOT update GitHub check status.

--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.

--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.

--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.

--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests. Will also run L0 pre-merge pipeline.

--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.

--extra-stage "H100_PCIe-[Post-Merge]-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-[Post-Merge]-1, xxx".

For guidance on mapping tests to stage names, see docs/source/reference/ci-overview.md.

kill

kill

Kill all running builds associated with the pull request.

skip

skip --comment COMMENT

Skip testing for the latest commit on the pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous, since a lack of user care and validation can cause top of tree to break.

reuse-pipeline

reuse-pipeline

Reuse a previous pipeline to validate the current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous, since a lack of user care and validation can cause top of tree to break.

@DylanChen-NV DylanChen-NV requested a review from a team as a code owner June 30, 2025 13:47
@DylanChen-NV DylanChen-NV changed the title from "[TRTLLM-5812][feat] support rowwise in torch flow" to "[TRTLLM-5812][feat] support FP8 row-wise GEMM in torch flow" Jun 30, 2025
@DylanChen-NV DylanChen-NV changed the title from "[TRTLLM-5812][feat] support FP8 row-wise GEMM in torch flow" to "[TRTLLM-5812][feat] support FP8 row-wise dense GEMM in torch flow" Jun 30, 2025
@juney-nvidia juney-nvidia requested review from Naveassaf and Tracin and removed request for dongxuy04 and liji-nv June 30, 2025 20:23
@DylanChen-NV DylanChen-NV force-pushed the row_wise_torch_flow branch from 20840d9 to 41579d6 Compare July 1, 2025 08:01
@DylanChen-NV
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #10461 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #10461 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #7742 completed with status: 'FAILURE'

@DylanChen-NV
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #10464 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #10464 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #7746 completed with status: 'FAILURE'

@DylanChen-NV
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #10473 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #10473 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7751 completed with status: 'FAILURE'

@DylanChen-NV
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #10503 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #10503 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7776 completed with status: 'FAILURE'

@DylanChen-NV
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #10512 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #10512 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7785 completed with status: 'FAILURE'

@DylanChen-NV DylanChen-NV force-pushed the row_wise_torch_flow branch from e1a100c to f6230e9 Compare July 2, 2025 03:19
@DylanChen-NV
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #10562 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #10562 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7816 completed with status: 'SUCCESS'

Collaborator

@tomeras91 tomeras91 left a comment

Overall LGTM. Added a few nit comments.

What's more important: Can you edit the PR description, remove the irrelevant info from the description template and add important information on what is added in this PR?

@DylanChen-NV
Collaborator Author

> Overall LGTM. Added a few nit comments.
>
> What's more important: Can you edit the PR description, remove the irrelevant info from the description template and add important information on what is added in this PR?

@tomeras91 Thanks for the comments. I’ve implemented all the suggested changes. Please feel free to approve or let me know if further changes are needed.

@DylanChen-NV DylanChen-NV force-pushed the row_wise_torch_flow branch from 11461a1 to d4950c7 Compare July 7, 2025 02:31
@DylanChen-NV
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #11086 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #11086 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #8195 completed with status: 'SUCCESS'

@DylanChen-NV DylanChen-NV force-pushed the row_wise_torch_flow branch from d4950c7 to cea6589 Compare July 7, 2025 09:24
@DylanChen-NV
Collaborator Author

/bot reuse-pipeline

@tensorrt-cicd
Collaborator

PR_Github #11131 [ reuse-pipeline ] triggered by Bot

@byshiue byshiue enabled auto-merge (squash) July 7, 2025 09:38
@tensorrt-cicd
Collaborator

PR_Github #11131 [ reuse-pipeline ] completed with state SUCCESS
Reusing PR_Github #11086 for commit cea6589

Signed-off-by: Dylan Chen <191843203+DylanChen-NV@users.noreply.github.com>
@DylanChen-NV DylanChen-NV force-pushed the row_wise_torch_flow branch from cea6589 to 819c9b8 Compare July 7, 2025 09:47
@DylanChen-NV
Collaborator Author

/bot reuse-pipeline

@tensorrt-cicd
Collaborator

PR_Github #11133 [ reuse-pipeline ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #11133 [ reuse-pipeline ] completed with state SUCCESS
Reusing PR_Github #11086 for commit 819c9b8

@byshiue byshiue merged commit 5ca2b9b into NVIDIA:main Jul 7, 2025
3 checks passed
zhou-yuxin pushed a commit to zhou-yuxin/TensorRT-LLM that referenced this pull request Jul 15, 2025
…IDIA#5615) Signed-off-by: Dylan Chen <191843203+DylanChen-NV@users.noreply.github.com> Signed-off-by: Yuxin <yuxinz@nvidia.com>
