@@ -52,13 +52,29 @@ FLAGS.model_name: tensorrt_llm
5252``` bash
5353 python3 tensorrt_llm/triton_backend/tools/inflight_batcher_llm/benchmark_core_model.py --max-input-len 500 \
5454 --tensorrt-llm-model-name tensorrt_llm \
55- --test-llmapi \
56- dataset --dataset ./tensorrt_llm/triton_backend/tools/dataset/mini_cnn_eval.json \
57- --tokenizer-dir meta-llama/Llama-3.1-8B
55+ --test-llmapi \
56+ dataset --dataset ./tensorrt_llm/triton_backend/tools/dataset/mini_cnn_eval.json \
57+ --tokenizer-dir meta-llama/Llama-3.1-8B
5858
5959dataset
6060Tokenizer: Tokens per word = 1.308
6161[INFO] Warm up for benchmarking.
6262[INFO] Start benchmarking on 39 prompts.
6363[INFO] Total Latency: 1446.623 ms
6464```
65+
66+ ### Start the server on a multi-node configuration
67+
68+ The `srun` tool can be used to start the server in a multi-node environment:
69+
70+ ```
71+ srun -N 2 \
72+ --ntasks-per-node=8 \
73+ --mpi=pmix \
74+ --container-image=<your image> \
75+ --container-mounts=$(pwd)/tensorrt_llm/:/code \
76+ bash /code/triton_backend/scripts/triton_task.sh
77+
78+ ```
79+
80+ Note: inter-node tensor parallelism is not yet supported.
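
As a quick sanity check on the `srun` invocation above, the sketch below (not part of this change) computes how many tasks the command would launch and prints the command instead of running it. The variable names `NODES`, `TASKS_PER_NODE`, `IMAGE`, and `SRUN_CMD` are illustrative assumptions, and `<your image>` remains a placeholder as in the diff.

```bash
#!/bin/sh
# Dry-run sketch: prints the srun command rather than executing it.
# NODES and TASKS_PER_NODE mirror the values in the diff; IMAGE is a placeholder.
set -eu

NODES=2
TASKS_PER_NODE=8
IMAGE="<your image>"

# srun launches NODES * TASKS_PER_NODE tasks in total.
TOTAL_TASKS=$((NODES * TASKS_PER_NODE))

# Assemble the command as a string for display only.
SRUN_CMD="srun -N $NODES --ntasks-per-node=$TASKS_PER_NODE --mpi=pmix --container-image=$IMAGE --container-mounts=$(pwd)/tensorrt_llm/:/code bash /code/triton_backend/scripts/triton_task.sh"

echo "Total tasks: $TOTAL_TASKS"
echo "$SRUN_CMD"
```

With the values from the diff, this reports 16 tasks spread across 2 nodes. Printing the command first makes it easy to review the mount path and image name before submitting the job.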