Explore all available models on GroqCloud.
Groq Compound is an AI system powered by openly available models that selectively uses built-in tools, including web search and code execution, to answer user queries.
GPT-OSS 120B is OpenAI's flagship open-weight language model, with 120 billion parameters, built-in browser search and code execution, and reasoning capabilities.
Note: Production models are intended for use in your production environments. They meet or exceed our high standards for speed, quality, and reliability. Read more here.
| MODEL ID | SPEED (T/SEC) | PRICE PER 1M TOKENS | RATE LIMITS (DEVELOPER PLAN) | CONTEXT WINDOW (TOKENS) | MAX COMPLETION TOKENS | MAX FILE SIZE |
|---|---|---|---|---|---|---|
| | 560 | $0.05 input / $0.08 output | 250K TPM / 1K RPM | 131,072 | 131,072 | - |
| | 280 | $0.59 input / $0.79 output | 300K TPM / 1K RPM | 131,072 | 32,768 | - |
| | 1200 | $0.20 input / $0.20 output | 30K TPM / 100 RPM | 131,072 | 1,024 | 20 MB |
| | 500 | $0.15 input / $0.75 output | 250K TPM / 1K RPM | 131,072 | 65,536 | - |
| | 1000 | $0.10 input / $0.50 output | 250K TPM / 1K RPM | 131,072 | 65,536 | - |
| | - | $0.111 per hour | 200K ASH / 300 RPM | - | - | 100 MB |
| | - | $0.04 per hour | 400K ASH / 400 RPM | - | - | 100 MB |
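The per-token prices in the table can be turned into a rough cost estimate for a single request. The sketch below is illustrative only: the function name `estimate_cost_usd` is hypothetical, and it assumes billing is strictly linear in input and output tokens at the listed per-1M-token rates, which may not account for every billing detail.

```python
def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float,
    output_price_per_m: float,
) -> float:
    """Estimate the cost of one request in US dollars.

    Assumption: cost scales linearly with token counts at the
    per-1M-token prices listed in the table.
    """
    tokens_per_unit = 1_000_000  # prices are quoted per 1M tokens
    input_cost = input_tokens / tokens_per_unit * input_price_per_m
    output_cost = output_tokens / tokens_per_unit * output_price_per_m
    return input_cost + output_cost


# Example using the first row's prices: $0.05 input, $0.08 output.
cost = estimate_cost_usd(200_000, 50_000, 0.05, 0.08)
print(f"${cost:.4f}")  # prints $0.0140
```

Rows priced per hour or per 1M characters use different units and would need their own formula.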
Systems are a collection of models and tools that work together to answer a user query.
| MODEL ID | SPEED (T/SEC) | PRICE PER 1M TOKENS | RATE LIMITS (DEVELOPER PLAN) | CONTEXT WINDOW (TOKENS) | MAX COMPLETION TOKENS | MAX FILE SIZE |
|---|---|---|---|---|---|---|
| | 450 | - | 200K TPM / 200 RPM | 131,072 | 8,192 | - |
| | 450 | - | 200K TPM / 200 RPM | 131,072 | 8,192 | - |
Note: Preview models are intended for evaluation purposes only and should not be used in production environments as they may be discontinued at short notice. Read more about deprecations here.
| MODEL ID | SPEED (T/SEC) | PRICE PER 1M TOKENS | RATE LIMITS (DEVELOPER PLAN) | CONTEXT WINDOW (TOKENS) | MAX COMPLETION TOKENS | MAX FILE SIZE |
|---|---|---|---|---|---|---|
| | 600 | $0.20 input / $0.60 output | 300K TPM / 1K RPM | 131,072 | 8,192 | 20 MB |
| | 750 | $0.11 input / $0.34 output | 300K TPM / 1K RPM | 131,072 | 8,192 | 20 MB |
| | - | $0.03 input / $0.03 output | 30K TPM / 100 RPM | 512 | 512 | - |
| | - | $0.04 input / $0.04 output | 30K TPM / 100 RPM | 512 | 512 | - |
| | 200 | $1.00 input / $3.00 output | 250K TPM / 1K RPM | 262,144 | 16,384 | - |
| | - | $50.00 per 1M characters | 50K TPM / 250 RPM | 8,192 | 8,192 | - |
| | - | $50.00 per 1M characters | 50K TPM / 250 RPM | 8,192 | 8,192 | - |
| | 400 | $0.29 input / $0.59 output | 300K TPM / 1K RPM | 131,072 | 40,960 | - |
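The rate-limit and speed columns can be used for back-of-the-envelope capacity planning. The following sketch reads TPM as tokens per minute and RPM as requests per minute. It assumes the limits behave like simple per-minute averages and that the listed speed is a steady output rate; actual enforcement and real-world throughput may differ. All function names are hypothetical.

```python
def min_request_interval_s(rpm: int) -> float:
    """Seconds to wait between requests to stay at or under the RPM limit."""
    return 60.0 / rpm


def avg_tokens_per_request(tpm: int, rpm: int) -> float:
    """Average token budget per request when sending at the full RPM limit."""
    return tpm / rpm


def generation_time_s(output_tokens: int, tokens_per_second: float) -> float:
    """Rough time to generate a completion at the listed speed."""
    return output_tokens / tokens_per_second


# Example: a row with 300K TPM, 1K RPM, and a speed of 600 tokens/sec.
print(min_request_interval_s(1_000))           # pacing between requests, in seconds
print(avg_tokens_per_request(300_000, 1_000))  # tokens available per request at full RPM
print(generation_time_s(8_192, 600))           # seconds for a maximum-length completion
```

At the full request rate, the token limit rather than the request limit is likely to be the binding constraint for all but very short requests.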
Deprecated models are models that are no longer supported or will no longer be supported in the future. See our deprecation guidelines and deprecated models here.
Hosted models are directly accessible through the GroqCloud Models API endpoint using the model IDs mentioned above. You can use the https://api.groq.com/openai/v1/models endpoint to return a JSON list of all active models:
curl -X GET "https://api.groq.com/openai/v1/models" -H "Authorization: Bearer $GROQ_API_KEY" -H "Content-Type: application/json"
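The same request can be made from Python using only the standard library. This is a minimal sketch, not an official client. It assumes the response follows the OpenAI-style list shape (`{"data": [{"id": ...}, ...]}`), which the `/openai/v1/` path suggests but the text above does not confirm. The helper names and the `User-Agent` value are hypothetical.

```python
import json
import os
import urllib.request

MODELS_URL = "https://api.groq.com/openai/v1/models"


def build_request(api_key: str) -> urllib.request.Request:
    """Build the GET request with the same headers as the curl command."""
    return urllib.request.Request(
        MODELS_URL,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            # Arbitrary illustrative value; not required by the docs above.
            "User-Agent": "models-list-example/0.1",
        },
        method="GET",
    )


def extract_model_ids(payload: dict) -> list[str]:
    """Return sorted model IDs from the response payload.

    Assumption: the payload looks like {"data": [{"id": "..."}, ...]}.
    Entries without an "id" field are skipped.
    """
    return sorted(item["id"] for item in payload.get("data", []) if "id" in item)


def list_model_ids(api_key: str) -> list[str]:
    """Call the endpoint and return the model IDs it reports."""
    with urllib.request.urlopen(build_request(api_key), timeout=30) as resp:
        return extract_model_ids(json.load(resp))


if __name__ == "__main__":
    key = os.environ.get("GROQ_API_KEY")
    if key:
        for model_id in list_model_ids(key):
            print(model_id)
    else:
        print("Set GROQ_API_KEY to query the endpoint.")
```

Keeping the parsing in `extract_model_ids` separate from the network call makes it possible to check the parsing logic against a saved response without an API key.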