Dingo: A Comprehensive AI Data, Model and Application Quality Evaluation Tool
Updated Dec 11, 2025 - JavaScript
Envision: Benchmarking Unified Understanding & Generation for Causal World Process Insights
[AAAI 2025] From 2D CAD Drawings to 3D Parametric Models: A Vision-Language Approach
An implementation for MLLM oversensitivity evaluation
#3 Winner of Best Use of Zoom API at Stanford TreeHacks 2025! An AI-powered meeting assistant that captures video, audio and textual context from Zoom calls using multimodal RAG.
[NVIDIA ONLY] [RTX 50 Support] Image generation, image editing and free-form manipulation with a VLM (Minimum requirements: 12GB VRAM / 32GB RAM. Recommended requirements: 24GB VRAM / 48GB RAM)
RealTime-VLM brings real-time VLM inference to the browser. It continuously captures webcam frames, sends image+text to an OpenAI-compatible API, and displays responses with sub-second latency. Works with local or hosted VLMs.
Real-time video captioning powered by FastVLM
This is the official implementation (code and data) of the paper "MOSSBench: Is Your Multimodal Language Model Oversensitive to Safe Queries?"
[NVIDIA ONLY] Image generation, image editing and free-form manipulation with a VLM (Minimum requirements: 12GB VRAM / 32GB RAM. Recommended requirements: 24GB VRAM / 48GB RAM)
AI-powered personal assistant built with OpenAI and the MERN stack: chat, analyze images, secure subs, with a claimed 50% faster responses.
[NVIDIA ONLY] Gradio demo for Flux Kontext based on Diffusers with single and multiple images. (Minimum requirements: 12GB VRAM / 48GB RAM. Recommended requirements: 24GB VRAM / 48GB RAM)
ScribblAI turns your chaotic doodles into photorealistic images using advanced AI — and your friends must guess what you were trying to draw. It's fast, fun, and full of AI magic!
We introduce YesBut-v2, a benchmark for assessing AI's ability to interpret juxtaposed comic panels with contradictory narratives. Unlike existing benchmarks, it emphasizes visual understanding, comparative reasoning, and social knowledge.
YesBut Benchmark: project page of the paper "Cracking the Code of Juxtaposition: Can AI Models Understand the Humorous Contradictions", accepted by NeurIPS 2024 (Oral).