Top 23 Python rag Projects
-
awesome-llm-apps
Collection of awesome LLM apps with AI Agents and RAG using OpenAI, Anthropic, Gemini and opensource models.
Project mention: Show HN: AI Real Estate Agent Team (100% Open-Source with Free Tutorial) | news.ycombinator.com | 2025-08-06
-
ragflow
RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs
Project mention: The AI-Native GraphDB + GraphRAG + Graph Memory Landscape & Market Catalog | dev.to | 2025-10-26
-
llama_index
Project mention: How to Build a RAG Solution with Llama Index, ChromaDB, and Ollama | dev.to | 2025-11-04
Step 2: Set up LlamaIndex and Chroma DB
-
chatgpt-on-wechat
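A chatbot built on large language models that integrates with WeChat Official Accounts, WeCom (Enterprise WeChat) apps, Feishu, DingTalk, and more. It works with ChatGPT/Claude/DeepSeek/ERNIE Bot/iFlytek Spark/Tongyi Qianwen/Gemini/GLM-4/Kimi/LinkAI, handles text, voice, and images, can access the operating system and the internet, and supports building a customized enterprise customer-service assistant on top of your own knowledge base.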
-
quivr
Opinionated RAG for integrating GenAI in your apps 🧠 Focus on your product rather than the RAG. Easy integration in existing products with customisation! Any LLM: GPT4, Groq, Llama. Any Vectorstore: PGVector, Faiss. Any Files. Any way you want.
-
MindsDB
Project mention: “One Journey Ends, Another Begins — My Hacktoberfest 2025 Story” | dev.to | 2025-10-31
Just wrapped up my Hacktoberfest project using MindsDB and Streamlit — built a CRM Semantic Search AI app! 😄 If anyone’s into open source + AI, would love feedback on my PR: Hacktoberfest 2025 PR – Add CRM Semantic Search use case (MindsDB)
-
khoj
Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.
-
graphrag
Project mention: The AI-Native GraphDB + GraphRAG + Graph Memory Landscape & Market Catalog | dev.to | 2025-10-26
URL: https://microsoft.github.io/graphrag/ and https://github.com/microsoft/graphrag and https://github.com/Azure-Samples/graphrag-accelerator
-
LightRAG
Project mention: 💻 Unlock RAG-Anything’s Power with Ollama on Your Machine (with Docling as Bonus) | dev.to | 2025-12-01
Modern documents increasingly contain diverse multimodal content — text, images, tables, equations, charts, and multimedia — that traditional text-focused RAG systems cannot effectively process. RAG-Anything addresses this challenge as a comprehensive All-in-One Multimodal Document Processing RAG system built on LightRAG.
-
kotaemon
Project mention: Kotaemon-papers: an open-source web app to chat with your academic papers | news.ycombinator.com | 2025-01-05
Hi HN,
Our team at https://github.com/Cinnamon/kotaemon/ has been working on a public demo to showcase the new advanced citation features in our RAG (retrieval-augmented generation) application.
We’re excited to share a web app that lets users explore top daily machine learning (ML) papers on Arxiv (via the HuggingFace API) and upload their own Arxiv papers to get LLM-assisted summaries, mind maps, and answers to questions based on the content.
Some notable features:
- Instant Summaries & Mind Maps: Generate concise summaries and visual mind maps for any Arxiv paper.
- Transparent Citations: Verify AI-generated answers with clear, evidence-backed citations. Citations are highlighted directly in the in-browser PDF viewer.
- Flexible Citation Options: Choose between highlights and inline citations. Plus, click on any sentence in the AI-generated response to see its supporting source from the original paper.
- Multi-Paper Analysis: Compare, contrast, and compose summaries from multiple papers simultaneously.
- Complex Question Solving: Use Chain-of-Thought (CoT) reasoning mode to break down and solve complex questions step-by-step.
- Customizable & Private Hosting: Easily self-host or customize your private app via HuggingFace Spaces. You can securely connect your LLM and upload your own document collections.
We’d love to hear your thoughts, feedback, and recommendations as we continue improving this tool.
Check out the demo here and happy hacking!
-
vanna
🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using Agentic Retrieval 🔄.
Project mention: How to Use Vanna.ai to Query Your Database with Open-Source Language Models | dev.to | 2025-12-02
Vanna.ai GitHub: https://github.com/vanna-ai/vanna
-
DB-GPT
AI Native Data App Development framework with AWEL(Agentic Workflow Expression Language) and Agents
Project mention: Launch HN: Gecko Security (YC F24) – AI That Finds Vulnerabilities in Code | news.ycombinator.com | 2025-08-01
Yes, that's exactly what we do. Some examples: https://github.com/eosphoros-ai/DB-GPT/pull/2650, https://github.com/dagster-io/dagster/pull/30002
We just need to follow responsible disclosure first by notifying the maintainers, working with them on a fix, and making it public once it is resolved.
-
DocsGPT
Private AI platform for agents, assistants and enterprise search. Built-in Agent Builder, Deep research, Document analysis, Multi-model support, and API connectivity for agents.
Project mention: 15 AI tools that almost replace a full dev team but please don’t fire us yet | dev.to | 2025-05-03
DocsGPT: Lets users query your docs using GPT.
-
onyx
Project mention: Launch HN: Onyx (YC W24) – The open-source chat UI | news.ycombinator.com | 2025-11-25
> ease of installation, streaming support, model agnosticity, chat persistence and blob support
we have all of those!
> how onyx is comparable
For an AI-powered research assistant, Onyx might just work out of the box. We have ~45 connectors to common apps (https://github.com/onyx-dot-app/onyx/blob/main/backend/onyx/...), integrations with the most popular web search providers (https://github.com/onyx-dot-app/onyx/blob/main/backend/onyx/...), and a built in tool calling loop w/ deep research support (https://github.com/onyx-dot-app/onyx/blob/main/backend/onyx/...). If you wanted to customize, you could pretty easily tweak this / add additional tools (or even rip this out completely and build your own agent loop).
-
txtai
Project mention: The AI-Native GraphDB + GraphRAG + Graph Memory Landscape & Market Catalog | dev.to | 2025-10-26
GitHub: https://github.com/neuml/txtai
-
Memori
Link: https://github.com/GibsonAI/Memori
-
memvid
Video-based AI memory library. Store millions of text chunks in MP4 files with lightning-fast semantic search. No database needed.
Project mention: Friday Links #30 — JavaScript Updates, Tools, and Inspiration | dev.to | 2025-10-17
memvid - Video-based AI memory library. Store millions of text chunks in MP4 files with lightning-fast semantic search. No database needed.
-
cognee
Project mention: The AI-Native GraphDB + GraphRAG + Graph Memory Landscape & Market Catalog | dev.to | 2025-10-26
URLs: https://github.com/topoteretes/cognee (hosted at cognee.ai / Cogwit)
-
paper-qa
Project mention: Show HN: Trieve CLI – Terminal-Based LLM Agent Loop with Search Tool for PDFs | news.ycombinator.com | 2025-06-18
https://github.com/Future-House/paper-qa?tab=readme-ov-file#... :
> PaperQA2 is engineered to be the best agentic RAG model for working with scientific papers.
> [ Semantic Scholar, CrossRef, ]
paperqa-zotero: https://github.com/lejacobroy/paperqa-zotero
The Oracle of Zotero is a fork of paperqa-zotero for FAISS and langchain:
Python rag related posts
-
Show HN: Built a tool to allow chatting with SEC filings
-
A customizable agentic AI toolkit for e-commerce
-
So you wanna build a local RAG?
-
Launch HN: Onyx (YC W24) – The open-source chat UI
-
Cross-Modal Embeddings: Bridging AI Modalities
-
The AI-Native GraphDB + GraphRAG + Graph Memory Landscape & Market Catalog
-
AI-Powered Cover Letter Generator
- A note from our sponsor - Stream getstream.io | 22 Dec 2025
Index
What are some of the best open-source rag projects in Python? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | awesome-llm-apps | 83,772 |
| 2 | ragflow | 69,871 |
| 3 | llama_index | 45,854 |
| 4 | mem0 | 44,522 |
| 5 | chatgpt-on-wechat | 40,145 |
| 6 | quivr | 38,685 |
| 7 | MindsDB | 38,107 |
| 8 | khoj | 31,962 |
| 9 | graphrag | 29,803 |
| 10 | LightRAG | 26,055 |
| 11 | kotaemon | 24,766 |
| 12 | Scrapegraph-ai | 22,007 |
| 13 | vanna | 21,958 |
| 14 | graphiti | 21,190 |
| 15 | DB-GPT | 17,819 |
| 16 | DocsGPT | 17,534 |
| 17 | onyx | 16,857 |
| 18 | txtai | 11,949 |
| 19 | Memori | 11,119 |
| 20 | memvid | 10,496 |
| 21 | cognee | 10,412 |
| 22 | paper-qa | 7,911 |
| 23 | Upsonic | 7,727 |