Starred repositories

roboflow / notebooks

A collection of tutorials on state-of-the-art computer vision models and techniques. Explore everything from foundational architectures like ResNet to cutting-edge models like YOLO11, RT-DETR, SAM …

Jupyter Notebook 8,570 1,346 Updated Oct 3, 2025

Freika / dawarich

Self-hostable alternative to Google Timeline (Google Location History)

Ruby 6,967 209 Updated Oct 15, 2025

rashadphz / farfalle

🔍 AI search engine - self-host with local or cloud LLMs

TypeScript 3,466 323 Updated Sep 27, 2024

immich-app / immich

High performance self-hosted photo and video management solution.

TypeScript 81,270 4,272 Updated Oct 15, 2025

LAION-AI / natural_voice_assistant

Python 493 41 Updated May 27, 2024

metavoiceio / metavoice-src

Foundational model for human-like, expressive TTS

Python 4,188 694 Updated Jul 30, 2024

saagarjha / Ensemble

Cast Mac windows to visionOS

Swift 876 43 Updated Oct 8, 2025

LiheYoung / Depth-Anything

[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation

Python 7,787 593 Updated Jul 17, 2024

apple / ml-aim

This repository provides the code and model checkpoints for AIMv1 and AIMv2 research projects.

Python 1,375 67 Updated Aug 4, 2025

rhasspy / piper

A fast, local neural text to speech system

C++ 10,140 838 Updated Aug 26, 2025

WhisperSpeech / WhisperSpeech

An Open Source text-to-speech system built by inverting Whisper.

Jupyter Notebook 4,502 261 Updated Jun 8, 2025

danny-avila / LibreChat

Enhanced ChatGPT Clone: Features Agents, MCP, DeepSeek, Anthropic, AWS, OpenAI, Responses API, Azure, Groq, o1, GPT-5, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message…

TypeScript 30,777 5,923 Updated Oct 15, 2025

myshell-ai / OpenVoice

Instant voice cloning by MIT and MyShell. Audio foundation model.

Python 34,639 3,823 Updated Apr 19, 2025

PKU-YuanGroup / Video-LLaVA

【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Python 3,370 243 Updated Dec 3, 2024

yl4579 / StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 5,997 625 Updated Aug 10, 2024

OneInterface / realtime-bakllava

llama.cpp with the BakLLaVA model describes what it sees

Python 382 41 Updated Nov 8, 2023

microsoft / autogen

A programming framework for agentic AI

Python 50,778 7,756 Updated Oct 8, 2025

zai-org / CogVLM

A state-of-the-art-level open visual language model | Multimodal pretrained model

Python 6,675 442 Updated May 29, 2024

AILab-CVC / VideoCrafter

VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models

Python 4,970 391 Updated Jul 10, 2024

continuedev / continue

⏩ Ship faster with Continuous AI. Build and run custom agents across your IDE, terminal, and CI

TypeScript 29,336 3,633 Updated Oct 15, 2025

smallcloudai / refact

AI Agent that handles engineering tasks end-to-end: integrates with developers’ tools, plans, executes, and iterates until it achieves a successful result.

Rust 3,337 287 Updated Sep 23, 2025

danielgross / localpilot

Python 3,379 146 Updated Feb 25, 2024

jzhang38 / TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 8,770 570 Updated May 3, 2024

Deci-AI / super-gradients

Easily train or fine-tune SOTA computer vision models with one open source training library. The home of YOLO-NAS.

Jupyter Notebook 4,927 566 Updated Sep 17, 2024

aras-p / UnityGaussianSplatting

Toy Gaussian Splatting visualization in Unity

C# 2,821 365 Updated Aug 5, 2025

graphdeco-inria / gaussian-splatting

Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"

Python 18,936 2,663 Updated Oct 30, 2024

m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 18,169 1,920 Updated Oct 15, 2025

Plachtaa / VALL-E-X

An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/

Python 7,929 790 Updated Feb 11, 2024

kingyiusuen / clip-image-search

Search images with a text or image query, using OpenAI's pretrained CLIP model.

Python 257 24 Updated Jan 15, 2022

facebookresearch / seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

Jupyter Notebook 11,677 1,160 Updated Nov 14, 2024

Josh Leverette coder543
