Commit 08ed112

feat: update documentation for multi-node (#762)
* feat: update documentation for multi-node

Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>

* fix markdown

Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>

---------

Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
1 parent d07ec31 commit 08ed112

1 file changed: +19 -3 lines changed

docs/llmapi.md

Lines changed: 19 additions & 3 deletions
@@ -52,13 +52,29 @@ FLAGS.model_name: tensorrt_llm
  ```bash
  python3 tensorrt_llm/triton_backend/tools/inflight_batcher_llm/benchmark_core_model.py --max-input-len 500 \
  --tensorrt-llm-model-name tensorrt_llm \
- --test-llmapi \
- dataset --dataset ./tensorrt_llm/triton_backend/tools/dataset/mini_cnn_eval.json \
- --tokenizer-dir meta-llama/Llama-3.1-8B
+ --test-llmapi \
+ dataset --dataset ./tensorrt_llm/triton_backend/tools/dataset/mini_cnn_eval.json \
+ --tokenizer-dir meta-llama/Llama-3.1-8B

  dataset
  Tokenizer: Tokens per word = 1.308
  [INFO] Warm up for benchmarking.
  [INFO] Start benchmarking on 39 prompts.
  [INFO] Total Latency: 1446.623 ms
  ```
+
+ ### Start the server on a multi-node configuration
+
+ The `srun` tool can be used to start the server in a multi-node environment:
+
+ ```
+ srun -N 2 \
+ --ntasks-per-node=8 \
+ --mpi=pmix \
+ --container-image=<your image> \
+ --container-mounts=$(pwd)/tensorrt_llm/:/code \
+ bash /code/triton_backend/scripts/triton_task.sh
+
+ ```
+
+ Note: inter-node tensor parallelism is not yet supported.
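
As a rough illustration of the `srun` invocation added in this commit: `-N 2` combined with `--ntasks-per-node=8` asks Slurm for 2 × 8 = 16 tasks in total. The sketch below is a dry run that assembles the same command and prints it instead of executing it, so it can be inspected on a machine without Slurm. The variable names (`NODES`, `TASKS_PER_NODE`, `IMAGE`, `SRUN_CMD`, `TOTAL_TASKS`) are illustrative assumptions and do not come from the repository.

```bash
#!/usr/bin/env bash
# Dry-run sketch: build the srun command from the docs and print it.
# All variable names here are illustrative, not part of the repository.
set -euo pipefail

NODES=2
TASKS_PER_NODE=8
IMAGE="<your image>"   # placeholder kept from the docs; substitute a real image

# Total number of tasks Slurm would launch across the allocation.
TOTAL_TASKS=$((NODES * TASKS_PER_NODE))

# Store the command as an array so each argument stays a single word.
SRUN_CMD=(
  srun -N "$NODES"
  --ntasks-per-node="$TASKS_PER_NODE"
  --mpi=pmix
  --container-image="$IMAGE"
  --container-mounts="$(pwd)/tensorrt_llm/:/code"
  bash /code/triton_backend/scripts/triton_task.sh
)

echo "total tasks: $TOTAL_TASKS"
# Print the shell-quoted command rather than running it.
printf '%q ' "${SRUN_CMD[@]}"
echo
```

To launch for real, one would replace the final `printf` with `"${SRUN_CMD[@]}"` on a cluster where `srun` and the container options are available.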
