Conversation

dtrifiro commented May 9, 2025

This depends on https://github.com/neuralmagic/nm-cicd/pull/103, which allows the accuracy/guidellm workflows to actually use the accelerator-specific overrides.

  • mistralai/Mixtral-8x7B-Instruct-v0.1: add a ROCm accuracy server override (avoids using tensor-parallel=8)
  • Llama-3.1-8B-Instruct: add an accuracy/server-rocm override (sets gpu_memory_utilization to avoid OOM errors)
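
The overrides described above might look roughly like the following sketch. The file layout and key names here are assumptions based on the PR description (the keys mirror common vLLM server arguments); the actual values used in the PR are not shown in this excerpt, so the numbers below are purely illustrative.

```yaml
# Hypothetical ROCm accuracy-server override; layout and values are assumptions.
model: 'meta-llama/Llama-3.1-8B-Instruct'
server:
  # Lowered to avoid OOM errors on ROCm accelerators (illustrative value).
  gpu_memory_utilization: 0.9
```

A similar override for Mixtral-8x7B-Instruct-v0.1 would set a smaller tensor-parallel degree than the default of 8, per the first bullet.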
@@ -0,0 +1,4 @@
# https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1
model: 'mistralai/Mixtral-8x7B-Instruct-v0.1'
A member commented on the diff:

The model config here can be removed.

dtrifiro force-pushed the rocm-configs branch 3 times, most recently from cec96c2 to bae1481, on May 12, 2025 at 16:21.