Groq

Supported Models

Explore all available models on GroqCloud.

Production Models

Note: Production models are intended for use in your production environments. They meet or exceed our high standards for speed, quality, and reliability. Read more here.

PROVIDER | MODEL | MODEL ID | SPEED (T/SEC) | PRICE PER 1M TOKENS | RATE LIMITS (DEVELOPER PLAN) | CONTEXT WINDOW (TOKENS) | MAX COMPLETION TOKENS | MAX FILE SIZE
--- | --- | --- | --- | --- | --- | --- | --- | ---
Meta | Llama 3.1 8B | llama-3.1-8b-instant | 560 | $0.05 input, $0.08 output | 250K TPM, 1K RPM | 131,072 | 131,072 | -
Meta | Llama 3.3 70B | llama-3.3-70b-versatile | 280 | $0.59 input, $0.79 output | 300K TPM, 1K RPM | 131,072 | 32,768 | -
Meta | Llama Guard 4 12B | meta-llama/llama-guard-4-12b | 1200 | $0.20 input, $0.20 output | 30K TPM, 100 RPM | 131,072 | 1,024 | 20 MB
OpenAI | GPT OSS 120B | openai/gpt-oss-120b | 500 | $0.15 input, $0.75 output | 250K TPM, 1K RPM | 131,072 | 65,536 | -
OpenAI | GPT OSS 20B | openai/gpt-oss-20b | 1000 | $0.10 input, $0.50 output | 250K TPM, 1K RPM | 131,072 | 65,536 | -
OpenAI | Whisper | whisper-large-v3 | - | $0.111 per hour | 200K ASH, 300 RPM | - | - | 100 MB
OpenAI | Whisper Large V3 Turbo | whisper-large-v3-turbo | - | $0.04 per hour | 400K ASH, 400 RPM | - | - | 100 MB
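
To illustrate how the per-1M-token prices translate into the cost of a single request, here is a minimal sketch. The function name estimate_cost_usd and the token counts are hypothetical; the two prices are the input and output rates listed above for llama-3.3-70b-versatile.

```python
def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the estimated request cost in USD.

    Prices are quoted per 1,000,000 tokens, so each token count is
    multiplied by its rate and the sum is divided by one million.
    """
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Hypothetical request: 2,000 prompt tokens and 500 completion tokens
# on llama-3.3-70b-versatile ($0.59 input, $0.79 output per 1M tokens).
cost = estimate_cost_usd(2_000, 500, 0.59, 0.79)
print(f"${cost:.6f}")
```

This covers token-priced models only; the Whisper models are priced per hour of audio and would need a different calculation.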

Production Systems

Systems are a collection of models and tools that work together to answer a user query.


PROVIDER | MODEL | MODEL ID | SPEED (T/SEC) | PRICE PER 1M TOKENS | RATE LIMITS (DEVELOPER PLAN) | CONTEXT WINDOW (TOKENS) | MAX COMPLETION TOKENS | MAX FILE SIZE
--- | --- | --- | --- | --- | --- | --- | --- | ---
Groq | Compound | groq/compound | 450 | - | 200K TPM, 200 RPM | 131,072 | 8,192 | -
Groq | Compound Mini | groq/compound-mini | 450 | - | 200K TPM, 200 RPM | 131,072 | 8,192 | -


Preview Models

Note: Preview models are intended for evaluation purposes only and should not be used in production environments as they may be discontinued at short notice. Read more about deprecations here.

PROVIDER | MODEL | MODEL ID | SPEED (T/SEC) | PRICE PER 1M TOKENS | RATE LIMITS (DEVELOPER PLAN) | CONTEXT WINDOW (TOKENS) | MAX COMPLETION TOKENS | MAX FILE SIZE
--- | --- | --- | --- | --- | --- | --- | --- | ---
Meta | Llama 4 Maverick 17B 128E | meta-llama/llama-4-maverick-17b-128e-instruct | 600 | $0.20 input, $0.60 output | 300K TPM, 1K RPM | 131,072 | 8,192 | 20 MB
Meta | Llama 4 Scout 17B 16E | meta-llama/llama-4-scout-17b-16e-instruct | 750 | $0.11 input, $0.34 output | 300K TPM, 1K RPM | 131,072 | 8,192 | 20 MB
Meta | Llama Prompt Guard 2 22M | meta-llama/llama-prompt-guard-2-22m | - | $0.03 input, $0.03 output | 30K TPM, 100 RPM | 512 | 512 | -
Meta | Prompt Guard 2 86M | meta-llama/llama-prompt-guard-2-86m | - | $0.04 input, $0.04 output | 30K TPM, 100 RPM | 512 | 512 | -
Moonshot AI | Kimi K2 0905 | moonshotai/kimi-k2-instruct-0905 | 200 | $1.00 input, $3.00 output | 250K TPM, 1K RPM | 262,144 | 16,384 | -
PlayAI | PlayAI TTS | playai-tts | - | $50.00 per 1M characters | 50K TPM, 250 RPM | 8,192 | 8,192 | -
PlayAI | PlayAI TTS Arabic | playai-tts-arabic | - | $50.00 per 1M characters | 50K TPM, 250 RPM | 8,192 | 8,192 | -
Alibaba Cloud | Qwen3-32B | qwen/qwen3-32b | 400 | $0.29 input, $0.59 output | 300K TPM, 1K RPM | 131,072 | 40,960 | -
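
The context window and max completion token columns can be used as a pre-flight check before sending a request. The sketch below is illustrative only: ModelLimits, LIMITS, and fits_limits are hypothetical names, the two entries are copied from the table above, and it assumes that prompt tokens plus requested completion tokens must together fit inside the context window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelLimits:
    context_window: int          # total tokens the model can attend to
    max_completion_tokens: int   # cap on generated tokens per request


# Values copied from the table above; extend with other models as needed.
LIMITS = {
    "moonshotai/kimi-k2-instruct-0905": ModelLimits(262_144, 16_384),
    "qwen/qwen3-32b": ModelLimits(131_072, 40_960),
}


def fits_limits(model_id: str, prompt_tokens: int, completion_tokens: int) -> bool:
    """Return True if the request stays within both published limits.

    Assumption: prompt and completion share the same context window.
    """
    limits = LIMITS[model_id]
    within_completion_cap = completion_tokens <= limits.max_completion_tokens
    within_context = prompt_tokens + completion_tokens <= limits.context_window
    return within_completion_cap and within_context


print(fits_limits("moonshotai/kimi-k2-instruct-0905", 200_000, 16_384))
print(fits_limits("qwen/qwen3-32b", 100_000, 40_960))
```

Token counts depend on each model's tokenizer, so the inputs here should be treated as estimates.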

Deprecated Models

Deprecated models are no longer supported or are scheduled to lose support. See our deprecation guidelines and the list of deprecated models here.

Get All Available Models

Hosted models are directly accessible through the GroqCloud Models API endpoint using the model IDs mentioned above. You can use the https://api.groq.com/openai/v1/models endpoint to return a JSON list of all active models:

curl -X GET "https://api.groq.com/openai/v1/models" \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -H "Content-Type: application/json"
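
The same request can be sketched in Python using only the standard library. The function names below are hypothetical, and the parsing step assumes the response follows the OpenAI-compatible shape of a JSON object with a "data" list whose items each carry an "id" field. The User-Agent value is arbitrary; it is set only because some API gateways reject the default urllib one.

```python
import json
import urllib.request

MODELS_URL = "https://api.groq.com/openai/v1/models"


def build_models_request(api_key: str) -> urllib.request.Request:
    """Build the GET request with the same headers as the curl command."""
    return urllib.request.Request(
        MODELS_URL,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "User-Agent": "models-list-example/0.1",  # arbitrary value
        },
        method="GET",
    )


def extract_model_ids(payload: dict) -> list[str]:
    """Pull model IDs out of the response (assumed shape: {"data": [{"id": ...}]})."""
    return sorted(item["id"] for item in payload.get("data", []))


def fetch_model_ids(api_key: str) -> list[str]:
    """Send the request and return the sorted list of active model IDs."""
    request = build_models_request(api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return extract_model_ids(json.load(response))
```

With the key exported as GROQ_API_KEY, as in the curl command, calling fetch_model_ids(os.environ["GROQ_API_KEY"]) after importing os should return the model IDs.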
