Top 23 Python Embedding Projects
- h2ogpt: Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
- Project mention: The AI-Native GraphDB + GraphRAG + Graph Memory Landscape & Market Catalog | dev.to | 2025-10-26
GitHub: https://github.com/neuml/txtai
- Project mention: BGE-Reasoner: An open-source framework for reasoning-intensive retrieval | news.ycombinator.com | 2025-08-27
- pytorch-metric-learning: The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.
- AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation
- towhee: A framework dedicated to making neural data processing pipelines simple and fast.
- prompttools: Open-source tools for prompt testing and experimentation, with support for both LLMs (e.g. OpenAI, LLaMA) and vector databases (e.g. Chroma, Weaviate, LanceDB).
- Project mention: One Embedder, Any Task: Instruction-Finetuned Text Embeddings | news.ycombinator.com | 2025-10-08
- Project mention: EmbeddingGemma: The Best-in-Class Open Model for On-Device Embedding | news.ycombinator.com | 2025-09-04
Can anyone test it through model2vec?
https://github.com/MinishLab/model2vec
- GPTDiscord: A robust, all-in-one GPT interface for Discord. ChatGPT-style conversations, image generation, AI moderation, custom indexes/knowledge base, YouTube summarizer, and more!
- contextualized-topic-models: A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021 (Bianchi et al.).
- lightly-train: All-in-one training for vision models (YOLO, ViTs, RT-DETR, DINOv3): pretraining, fine-tuning, distillation.
Python Embeddings related posts
- Introducing Nano Banana Pro: Complete Developer Tutorial
- How to Build a RAG Solution with Llama Index, ChromaDB, and Ollama
- One Embedder, Any Task: Instruction-Finetuned Text Embeddings
- Translating Cython to Mojo, a first attempt
- Token Counting Meets Amazon Bedrock
- BGE-Reasoner: An open-source framework for reasoning-intensive retrieval
- Show HN: Distill DINOv3 into your own model
Index
What are some of the best open-source Embedding projects in Python? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | mem0 | 44,522 |
| 2 | h2ogpt | 11,976 |
| 3 | txtai | 11,949 |
| 4 | FlagEmbedding | 11,024 |
| 5 | pytorch-metric-learning | 6,282 |
| 6 | AutoRAG | 4,485 |
| 7 | lightly | 3,650 |
| 8 | hub | 3,523 |
| 9 | towhee | 3,426 |
| 10 | prompttools | 2,956 |
| 11 | datachain | 2,716 |
| 12 | fastembed | 2,573 |
| 13 | ailia-models | 2,296 |
| 14 | instructor-embedding | 2,021 |
| 15 | model2vec | 1,957 |
| 16 | GPTDiscord | 1,853 |
| 17 | magnitude | 1,652 |
| 18 | eda_nlp | 1,649 |
| 19 | ModernBERT | 1,594 |
| 20 | hazm | 1,332 |
| 21 | contextualized-topic-models | 1,254 |
| 22 | SeaGOAT | 1,238 |
| 23 | lightly-train | 1,178 |