[CVPR 2025] HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation
Updated Jul 8, 2025 (Python)
Easy text classification for everyone: BERT-based models via Hugging Face Transformers (KR / EN)
Upload & Merge CSV or JSON Data with Images to Notion Database
ScaleDP is an Open-Source extension of Apache Spark for Document Processing
Simple generative-AI Streamlit web application that converts speech to images.
Multimodal Document Processing RAG with LangChain
This repository contains code for generating blog content using the Llama 2 language model. It integrates with Streamlit for easy user interaction. Input a blog topic, desired word count, and writing style to generate blog content.
🤗 A Python script for efficiently downloading and reconstructing large Hugging Face model files by splitting them into manageable chunks
A URL summarizer that summarizes the content of a URL with proper formatting. It uses 'sshleifer/distilbart-cnn-12-6', a distilled version of the BART model fine-tuned for summarization on the CNN/DailyMail news dataset.
A command-line tool for quickly downloading models and datasets from Hugging Face mirror sites.
The UniversalLLMAdapter class initializes the appropriate client based on the specified provider.
An AI-powered CLI tool that helps you organize and create projects as quickly as possible.
VLM-Parsing is a Gradio-based web application for parsing documents and images into structured HTML and Markdown formats using advanced Vision Language Models (VLMs).
A fine-tuned version of SmolLM2-360M-Instruct-bnb-4bit specialized for parsing unstructured calendar event requests into structured JSON data.
Multimodal-OCR3 is an advanced Optical Character Recognition (OCR) application that leverages multiple state-of-the-art multimodal models to extract text from images.
TranslateAI is a powerful real-time speech translation desktop application built using PyQt and Hugging Face models. It enables users to convert spoken words into text and translate them into different languages.
Age-Classification-SigLIP2 is an image classification vision-language encoder model fine-tuned from google/siglip2-base-patch16-224 for a single-label classification task. It is designed to predict the age group of a person from an image using the SiglipForImageClassification architecture.
🤗 Proof of concept of an interactive chat built with Flask and Tailwind CSS. It uses smolagents to generate answers to questions. Includes basic text formatting and a minimalist interface. A work-in-progress project for testing and learning purposes.
Professional-grade cryptocurrency analysis with advanced AI/ML predictions, recognition of 50+ patterns, and Matplotlib terminal charts. CryptVault is an informational tool for educational and research purposes only.
Multiple anime image classifiers / anime image checker