Models

Active filters: text-generation-inference

google/functiongemma-270m-it

Text Generation • 0.3B • Updated 5 days ago • 21.1k • 511

Nanbeige/Nanbeige4-3B-Thinking-2511

Text Generation • 4B • Updated 6 days ago • 5.68k • 138

nvidia/Nemotron-Cascade-14B-Thinking

Text Generation • 15B • Updated 5 days ago • 2.04k • 43

nvidia/Nemotron-Cascade-8B

Text Generation • 8B • Updated 5 days ago • 668 • 40

ekwek/Soprano-80M

Text-to-Speech • 79.7M • Updated 1 day ago • 324 • 31

ByteDance/Dolphin-v2

Image-Text-to-Text • 4B • Updated 12 days ago • 1.63k • 93

EssentialAI/rnj-1-instruct

Text Generation • 8B • Updated 3 days ago • 454k • 287

meta-llama/Llama-3.1-8B-Instruct

Text Generation • 8B • Updated Sep 25, 2024 • 10.6M • 5.16k

google/gemma-3-27b-it

Image-Text-to-Text • 27B • Updated Mar 21 • 1.64M • 1.76k

google/gemma-3-4b-it

Image-Text-to-Text • 4B • Updated Mar 21 • 894k • 1.06k

Qwen/Qwen3-0.6B

Text Generation • 0.8B • Updated Jul 26 • 8.04M • 904

utter-project/EuroLLM-22B-Instruct-2512

Text Generation • 23B • Updated 8 days ago • 1.27k • 29

nvidia/Nemotron-Cascade-8B-Thinking

Text Generation • 8B • Updated 5 days ago • 985 • 24

Qwen/Qwen2.5-7B-Instruct

Text Generation • 8B • Updated Jan 12 • 6.89M • 968

google/gemma-3-1b-it

Text Generation • 1.0B • Updated Apr 4 • 1.7M • 764

kakaocorp/kanana-2-30b-a3b-instruct

Text Generation • 31B • Updated 5 days ago • 386 • 17

Qwen/Qwen3-8B

Text Generation • 8B • Updated Jul 26 • 4.31M • 824

Qwen/Qwen3-4B-Instruct-2507

Text Generation • 4B • Updated Sep 17 • 4.33M • 577

nvidia/Nemotron-Orchestrator-8B

Text Generation • 8B • Updated 22 days ago • 59.6k • 465

meta-llama/Llama-3.2-1B

Text Generation • 1B • Updated Oct 24, 2024 • 1.62M • 2.23k

meta-llama/Llama-3.2-3B-Instruct

Text Generation • 3B • Updated Oct 24, 2024 • 1.97M • 1.88k

deepseek-ai/DeepSeek-R1

Text Generation • 685B • Updated Mar 27 • 775k • 12.9k

kakaocorp/kanana-2-30b-a3b-thinking

Text Generation • 31B • Updated 5 days ago • 605 • 13

Qwen/Qwen3-Embedding-8B

Feature Extraction • 8B • Updated Jul 7 • 948k • 496

FutureMa/Qwen3-8B-Drama-Thinking

Text Generation • 308k • Updated about 7 hours ago • 2.12k • 87

t-tech/T-pro-it-2.1

Text Generation • 33B • Updated about 7 hours ago • 2 • 12

meta-llama/Llama-2-7b-hf

Text Generation • 7B • Updated Apr 17, 2024 • 558k • 2.23k

meta-llama/Meta-Llama-3-8B-Instruct

Text Generation • 8B • Updated Jun 18 • 1.51M • 4.33k

meta-llama/Llama-3.1-8B

Text Generation • 8B • Updated Oct 16, 2024 • 725k • 1.98k

Qwen/Qwen2.5-VL-7B-Instruct

Image-Text-to-Text • 8B • Updated Apr 6 • 2.72M • 1.4k