A 32B experimental reasoning model for advanced text generation and robust instruction following. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>
Updated Mar 11, 2025 - Python
Smaug-72B topped the Hugging Face Open LLM Leaderboard as the first model to reach an average score of 80, making it the leading open-source foundation model at the time. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>
A distilled DeepSeek-R1 variant built on Qwen2.5-32B, fine-tuned with curated data for enhanced performance and efficiency. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>
A variant of the BART model designed specifically for natural language summarization. It was pre-trained on a large corpus of English text and later fine-tuned on the CNN/Daily Mail dataset. <metadata> gpu: T4 | collections: ["HF Transformers"] </metadata>
A GPTQ‑quantized version of Eric Hartford’s Dolphin 2.5 Mixtral 8x7B model, fine‑tuned for coding and conversational tasks. <metadata> gpu: A100 | collections: ["vLLM","GPTQ"] </metadata>
A 7B autoregressive language model by Mistral AI, optimized for efficient text generation and robust reasoning. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>
A GPTQ-quantized variant of the Mixtral 8x7B model, fine-tuned for efficient text generation and conversational applications. <metadata> gpu: A100 | collections: ["vLLM","GPTQ"] </metadata>
A quantized model fine-tuned for rapid, efficient, and robust conversational and instruction tasks. <metadata> gpu: A100 | collections: ["vLLM","AWQ"] </metadata>
Deploy a GGUF-quantized version of TinyLlama-1.1B with vLLM for efficient inference. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>
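Once deployed, vLLM exposes an OpenAI-compatible HTTP API. A minimal stdlib-only client sketch, assuming the server is running locally on port 8000 (the base URL and model name are placeholders, not fixed values):

```python
import json
from urllib import request

def build_chat_payload(prompt: str, model: str, max_tokens: int = 256,
                       temperature: float = 0.7) -> dict:
    """Build an OpenAI-compatible /v1/chat/completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

def send_chat(prompt: str, model: str,
              base_url: str = "http://localhost:8000") -> str:
    """POST the payload to a vLLM server (assumed URL) and return the reply."""
    body = json.dumps(build_chat_payload(prompt, model)).encode("utf-8")
    req = request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]
```

The same payload shape works for any of the vLLM-served models on this page; only the `model` field changes.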
A 2B instruct-tuned model delivering coherent, instruction-following responses across a wide range of tasks. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>
A robust 8B parameter base model for diverse language tasks, offering strong performance in multilingual scenarios. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>
A 35B model delivering high performance in reasoning, summarization, and question answering. <metadata> gpu: A100 | collections: ["HF Transformers"] </metadata>
A medical LLM built on LLaMA-3.1-70B, employing detailed step-by-step reasoning for complex medical problem-solving. <metadata> gpu: A100 | collections: ["HF Transformers","Variable Inputs"] </metadata>
Shopify AI blogger. Generate blog posts with ChatGPT!
A 7B model with a 32k-token context window and optimized attention mechanisms for superior dialogue and reasoning. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>
An 8B-parameter, instruction-tuned variant of Meta's Llama-3.1 model, optimized in GGUF format for efficient inference. <metadata> gpu: A100 | collections: ["llama.cpp"] </metadata>
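GGUF models are typically run with llama.cpp's `llama-cli` binary. A small sketch that assembles the command line rather than hard-coding a shell string; the flag names (`-m`, `-p`, `-n`) follow common llama.cpp usage, but options vary between releases, so check your installed build:

```python
import shlex

def llama_cli_command(model_path: str, prompt: str,
                      n_predict: int = 128) -> list[str]:
    """Assemble a llama-cli invocation for a GGUF model file."""
    return ["llama-cli", "-m", model_path, "-p", prompt, "-n", str(n_predict)]

def as_shell(cmd: list[str]) -> str:
    """Quote the argument list as a copy-pasteable shell string."""
    return " ".join(shlex.quote(part) for part in cmd)
```

The list form can be passed directly to `subprocess.run`, which avoids shell-quoting bugs when prompts contain spaces or quotes.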
A compact 3B instruction-tuned model that generates detailed responses across a range of tasks. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>
Advanced multimodal language model developed by Mistral AI with enhanced text performance, robust vision capabilities, and an expanded context window of up to 128,000 tokens. <metadata> gpu: A100 | collections: ["HF Transformers"] </metadata>
A 7B instruction-tuned language model that excels in following detailed prompts and effectively performing a wide variety of natural language processing tasks. <metadata> gpu: A100 | collections: ["HF Transformers"] </metadata>
A 10.7B language model by UpStage, fine-tuned for advanced text generation, precise instruction-following, and diverse NLP applications, delivering remarkably robust performance across creative and enterprise tasks at scale. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>