@liji-nv
Collaborator

liji-nv commented Jul 3, 2025

PR title

Please write the PR title using the following template:

[JIRA ticket link/nvbug link/github issue link][fix/feat/doc/infra/...] <summary of this PR>

For example, for a PR that adds a new cache manager feature tracked by Jira ticket TRTLLM-1000, the title would be:

[TRTLLM-1000][feat] Support a new feature about cache manager
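
As a rough illustration, a title in this template could be checked with a regular expression. The sketch below is an assumption about how such a check might look, not the project's actual validation: the name `is_valid_pr_title` and the pattern itself are hypothetical, and treating the bracketed ticket field as optional is a choice made here so that titles such as `[fix] <summary>` also pass.

```python
import re

# Hypothetical pattern: an optional bracketed ticket reference, a bracketed
# lowercase change type (fix, feat, doc, infra, ...), a space, then a summary.
TITLE_RE = re.compile(r"^(\[[^\[\]]+\])?\[([a-z]+)\] \S.*$")


def is_valid_pr_title(title: str) -> bool:
    """Return True if the title loosely matches [ticket][type] <summary>."""
    return TITLE_RE.match(title) is not None
```

Under these assumptions, the example title above passes, while a title without a bracketed type or without a summary is rejected.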

Description

Please briefly explain the issue and the solution.

Test Coverage

GitHub Bot Help

/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

Provides a user-friendly way for developers to interact with a Jenkins server.

Run /bot [-h|--help] to print this help message.

See details below for each supported subcommand.

Details

run [--disable-fail-fast --skip-test --stage-list "A10-1, xxx" --gpu-type "A30, H100_PCIe" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-[Post-Merge]-1, xxx"]

Launch build/test pipelines. All previously running jobs will be killed.

--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.

--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.

--stage-list "A10-1, xxx" (OPTIONAL) : Only run the specified test stages. Examples: "A10-1, xxx". Note: Does NOT update GitHub check status.

--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.

--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.

--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.

--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests. Will also run L0 pre-merge pipeline.

--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.

--extra-stage "H100_PCIe-[Post-Merge]-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-[Post-Merge]-1, xxx".

For guidance on mapping tests to stage names, see docs/source/reference/ci-overview.md.
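
To make the option list above concrete, here is a minimal sketch of how a `/bot run` comment could be parsed with Python's standard `argparse` and `shlex` modules. It is not the bot's real implementation: the helper names `split_csv`, `build_run_parser`, and `parse_run_comment` are hypothetical, and only the flags listed in the help text above are modeled.

```python
import argparse
import shlex
from typing import List


def split_csv(value: str) -> List[str]:
    """Split a quoted, comma-separated value such as "A10-1, xxx" into items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def build_run_parser() -> argparse.ArgumentParser:
    """Build a parser for the `run` options listed in the help text."""
    parser = argparse.ArgumentParser(
        prog="/bot run", add_help=False, allow_abbrev=False
    )
    # Boolean switches: present means enabled.
    for flag in (
        "--disable-fail-fast",
        "--skip-test",
        "--add-multi-gpu-test",
        "--only-multi-gpu-test",
        "--disable-multi-gpu-test",
        "--post-merge",
    ):
        parser.add_argument(flag, action="store_true")
    # Options that take a comma-separated list of names.
    for opt in ("--stage-list", "--gpu-type", "--extra-stage"):
        parser.add_argument(opt, type=split_csv, default=[])
    return parser


def parse_run_comment(comment: str) -> argparse.Namespace:
    """Parse a PR comment of the form `/bot run [options]`."""
    tokens = shlex.split(comment)  # honors the double quotes around lists
    if tokens[:2] != ["/bot", "run"]:
        raise ValueError("not a '/bot run' comment")
    return build_run_parser().parse_args(tokens[2:])
```

For instance, `parse_run_comment('/bot run --stage-list "A10-1, xxx"')` yields a namespace whose `stage_list` holds the two stage names and whose boolean switches are all `False`. Flags that the help text does not list are rejected by this sketch.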

kill

kill

Kill all running builds associated with the pull request.

skip

skip --comment COMMENT

Skip testing for the latest commit on the pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous, because skipping tests without careful validation can break the top of tree.

reuse-pipeline

reuse-pipeline

Reuse a previous pipeline to validate the current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous, because reusing results without careful validation can break the top of tree.

@liji-nv
Collaborator Author

liji-nv commented Jul 3, 2025

/bot run --stage-list DGX_H100-4_GPUs-PyTorch-DeepSeek-1

@tensorrt-cicd
Collaborator

PR_Github #10784 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #10784 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7963 (Partly Tested) completed with status: 'SUCCESS'

@liji-nv
Collaborator Author

liji-nv commented Jul 4, 2025

/bot run --stage-list DGX_H100-4_GPUs-PyTorch-DeepSeek-1

@tensorrt-cicd
Collaborator

PR_Github #10904 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #10904 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #8058 (Partly Tested) completed with status: 'SUCCESS'

@liji-nv
Collaborator Author

liji-nv commented Jul 4, 2025

/bot run --stage-list DGX_H100-4_GPUs-PyTorch-DeepSeek-1

@tensorrt-cicd
Collaborator

PR_Github #10942 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #10942 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #8089 (Partly Tested) completed with status: 'SUCCESS'

@liji-nv
Collaborator Author

liji-nv commented Jul 4, 2025

/bot run --stage-list DGX_H100-4_GPUs-PyTorch-DeepSeek-1

@liji-nv
Collaborator Author

liji-nv commented Jul 4, 2025

/bot run --stage-list DGX_B200-4_GPUs-PyTorch-Post-Merge-1

@tensorrt-cicd
Collaborator

PR_Github #10959 [ run ] triggered by Bot

@liji-nv
Collaborator Author

liji-nv commented Jul 4, 2025

/bot -h

@github-actions

github-actions bot commented Jul 4, 2025

GitHub Bot Help

/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

@tensorrt-cicd
Collaborator

PR_Github #10959 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #8098 (Partly Tested) completed with status: 'SUCCESS'

@liji-nv
Collaborator Author

liji-nv commented Jul 4, 2025

/bot run --stage-list DGX_H100-4_GPUs-PyTorch-DeepSeek-1

@tensorrt-cicd
Collaborator

PR_Github #10993 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #10993 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #8122 (Partly Tested) completed with status: 'FAILURE'

@liji-nv
Collaborator Author

liji-nv commented Jul 4, 2025

/bot run --stage-list DGX_H100-4_GPUs-PyTorch-DeepSeek-1

@tensorrt-cicd
Collaborator

PR_Github #11001 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #11001 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #8128 (Partly Tested) completed with status: 'SUCCESS'

@liji-nv
Collaborator Author

liji-nv commented Jul 7, 2025

/bot run --stage-list DGX_H100-4_GPUs-PyTorch-DeepSeek-1 --disable-reuse-test

@tensorrt-cicd
Collaborator

PR_Github #11085 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #11085 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #8196 (Partly Tested) completed with status: 'SUCCESS'

@liji-nv
Collaborator Author

liji-nv commented Jul 7, 2025

/bot run --stage-list DGX_H100-4_GPUs-PyTorch-DeepSeek-1 --disable-reuse-test

@tensorrt-cicd
Collaborator

PR_Github #11122 [ run ] triggered by Bot

liji-nv added 2 commits July 7, 2025 02:13
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
liji-nv force-pushed the dev-liji-unwaive-intermitent branch 2 times, most recently from 211eb0c to b1dee3c, on July 7, 2025 09:16
@liji-nv
Collaborator Author

liji-nv commented Jul 7, 2025

/bot run

liji-nv changed the title from "[fix] https://nvbugs/5333654 Unwaive to check ci status" to "[fix] https://nvbugs/5333654 Unwaive to check ci status and improve torch compile multi-gpu coverage" on Jul 7, 2025
@tensorrt-cicd
Collaborator

PR_Github #11129 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #11122 [ run ] completed with state ABORTED

@tensorrt-cicd
Collaborator

PR_Github #11129 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #8228 completed with status: 'FAILURE'

Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
liji-nv force-pushed the dev-liji-unwaive-intermitent branch from b1dee3c to 6a75ffe on July 7, 2025 10:10
@liji-nv
Collaborator Author

liji-nv commented Jul 7, 2025

/bot run

@tensorrt-cicd
Collaborator

PR_Github #11138 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #11138 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #8234 completed with status: 'SUCCESS'

liji-nv enabled auto-merge (squash) on July 8, 2025 04:41
liji-nv merged commit 95978e3 into NVIDIA:main on Jul 8, 2025
3 checks passed
zhou-yuxin pushed a commit to zhou-yuxin/TensorRT-LLM that referenced this pull request Jul 15, 2025
…orch compile multi-gpu coverage (NVIDIA#5700) Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com> Signed-off-by: Yuxin <yuxinz@nvidia.com>