Python rag

Open-source Python projects categorized as rag

Top 23 Python rag Projects

  1. awesome-llm-apps

    Collection of awesome LLM apps with AI Agents and RAG using OpenAI, Anthropic, Gemini and opensource models.

    Project mention: Show HN: AI Real Estate Agent Team (100% Open-Source with Free Tutorial) | news.ycombinator.com | 2025-08-06
  2. InfluxDB

    InfluxDB – Built for High-Performance Time Series Workloads. InfluxDB 3 OSS is now GA. Transform, enrich, and act on time series data directly in the database. Automate critical tasks and eliminate the need to move data externally. Download now.

  3. ragflow

    RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs

    Project mention: The AI-Native GraphDB + GraphRAG + Graph Memory Landscape & Market Catalog | dev.to | 2025-10-26
  4. llama_index

    LlamaIndex is the leading framework for building LLM-powered agents over your data.

    Project mention: How to Build a RAG Solution with Llama Index, ChromaDB, and Ollama | dev.to | 2025-11-04

    Step 2: Set up LlamaIndex and Chroma DB

  5. mem0

    Universal memory layer for AI Agents

    Project mention: Write an Agent | news.ycombinator.com | 2025-11-06
  6. chatgpt-on-wechat

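    A chatbot built on large language models. It integrates with WeChat Official Accounts, WeCom (Enterprise WeChat) apps, Feishu, DingTalk, and more; works with ChatGPT/Claude/DeepSeek/ERNIE Bot/iFlytek Spark/Tongyi Qianwen/Gemini/GLM-4/Kimi/LinkAI; handles text, voice, and images; can access the operating system and the internet; and supports building customized enterprise customer-service assistants on top of your own knowledge base.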

  7. quivr

    Opinionated RAG for integrating GenAI in your apps 🧠 Focus on your product rather than the RAG. Easy integration in existing products with customisation! Any LLM: GPT4, Groq, Llama. Any Vectorstore: PGVector, Faiss. Any Files. Any way you want.

  8. MindsDB

    Query Engine for AI - The only MCP Server you'll ever need

    Project mention: “One Journey Ends, Another Begins — My Hacktoberfest 2025 Story” | dev.to | 2025-10-31

    Just wrapped up my Hacktoberfest project using MindsDB and Streamlit — built a CRM Semantic Search AI app! 😄 If anyone’s into open source + AI, would love feedback on my PR: Hacktoberfest 2025 PR – Add CRM Semantic Search use case (MindsDB)

  9. Stream

    Stream - Scalable APIs for Chat, Feeds, Moderation, & Video. Stream helps developers build engaging apps that scale to millions with performant and flexible Chat, Feeds, Moderation, and Video APIs and SDKs powered by a global edge network and enterprise-grade infrastructure.

  10. khoj

    Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.

  11. graphrag

    A modular graph-based Retrieval-Augmented Generation (RAG) system

    Project mention: The AI-Native GraphDB + GraphRAG + Graph Memory Landscape & Market Catalog | dev.to | 2025-10-26

    URL: https://microsoft.github.io/graphrag/ and https://github.com/microsoft/graphrag and https://github.com/Azure-Samples/graphrag-accelerator

  12. LightRAG

    [EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"

    Project mention: 💻 Unlock RAG-Anything’s Power with Ollama on Your Machine (with Docling as Bonus) | dev.to | 2025-12-01

    Modern documents increasingly contain diverse multimodal content — text, images, tables, equations, charts, and multimedia — that traditional text-focused RAG systems cannot effectively process. RAG-Anything addresses this challenge as a comprehensive All-in-One Multimodal Document Processing RAG system built on LightRAG.

  13. kotaemon

    An open-source RAG-based tool for chatting with your documents.

    Project mention: Kotaemon-papers: an open-source web app to chat with your academic papers | news.ycombinator.com | 2025-01-05

    Hi HN,

    Our team at https://github.com/Cinnamon/kotaemon/ has been working on a public demo to showcase the new advanced citation features in our RAG (retrieval-augmented generation) application.

    We’re excited to share a web app that lets users explore top daily machine learning (ML) papers on Arxiv (via the HuggingFace API) and upload their own Arxiv papers to get LLM-assisted summaries, mind maps, and answers to questions based on the content.

    Some notable features:

    - Instant Summaries & Mind Maps: Generate concise summaries and visual mind maps for any Arxiv paper.

    - Transparent Citations: Verify AI-generated answers with clear, evidence-backed citations. Citations are highlighted directly in the in-browser PDF viewer.

    - Flexible Citation Options: Choose between highlights and inline citations. Plus, click on any sentence in the AI-generated response to see its supporting source from the original paper.

    - Multi-Paper Analysis: Compare, contrast, and compose summaries from multiple papers simultaneously.

    - Complex Question Solving: Use Chain-of-Thought (CoT) reasoning mode to break down and solve complex questions step-by-step.

    - Customizable & Private Hosting: Easily self-host or customize your private app via HuggingFace Spaces. You can securely connect your LLM and upload your own document collections.

    We’d love to hear your thoughts, feedback, and recommendations as we continue improving this tool.

    Check out the demo here and happy hacking!

  14. Scrapegraph-ai

    Python scraper based on AI

    Project mention: ScrapeGraphAI Release Week | news.ycombinator.com | 2025-07-07
  15. vanna

    🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using Agentic Retrieval 🔄.

    Project mention: How to Use Vanna.ai to Query Your Database with Open-Source Language Models | dev.to | 2025-12-02

    Vanna.ai GitHub: https://github.com/vanna-ai/vanna

  16. graphiti

    Build Real-Time Knowledge Graphs for AI Agents

    Project mention: I built an faster Notion in Rust | news.ycombinator.com | 2025-11-24
  17. DB-GPT

    AI Native Data App Development framework with AWEL (Agentic Workflow Expression Language) and Agents

    Project mention: Launch HN: Gecko Security (YC F24) – AI That Finds Vulnerabilities in Code | news.ycombinator.com | 2025-08-01

    Yes, that's exactly what we do. Some examples: https://github.com/eosphoros-ai/DB-GPT/pull/2650, https://github.com/dagster-io/dagster/pull/30002

    We just need to follow responsible disclosure first by notifying the maintainers, working with them on a fix, and making it public once it is resolved.

  18. DocsGPT

    Private AI platform for agents, assistants and enterprise search. Built-in Agent Builder, Deep research, Document analysis, Multi-model support, and API connectivity for agents.

    Project mention: 15 AI tools that almost replace a full dev team but please don’t fire us yet | dev.to | 2025-05-03

    DocsGPT: Lets users query your docs using GPT.

  19. onyx

    Open Source AI Platform - AI Chat with advanced features that works with every LLM

    Project mention: Launch HN: Onyx (YC W24) – The open-source chat UI | news.ycombinator.com | 2025-11-25

    > ease of installation, streaming support, model agnosticity, chat persistence and blob support

    we have all of those!

    > how onyx is comparable

    For an AI-powered research assistant, Onyx might just work out of the box. We have ~45 connectors to common apps (https://github.com/onyx-dot-app/onyx/blob/main/backend/onyx/...), integrations with the most popular web search providers (https://github.com/onyx-dot-app/onyx/blob/main/backend/onyx/...), and a built in tool calling loop w/ deep research support (https://github.com/onyx-dot-app/onyx/blob/main/backend/onyx/...). If you wanted to customize, you could pretty easily tweak this / add additional tools (or even rip this out completely and build your own agent loop).

  20. txtai

    💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows

    Project mention: The AI-Native GraphDB + GraphRAG + Graph Memory Landscape & Market Catalog | dev.to | 2025-10-26

    GitHub: https://github.com/neuml/txtai

  21. Memori

    SQL Native Memory Layer for LLMs, AI Agents & Multi-Agent Systems

    Project mention: Daily AI & Automation Tech News - November 20, 2025 | dev.to | 2025-11-19

    Link: https://github.com/GibsonAI/Memori

  22. memvid

    Video-based AI memory library. Store millions of text chunks in MP4 files with lightning-fast semantic search. No database needed.

    Project mention: Friday Links #30 — JavaScript Updates, Tools, and Inspiration | dev.to | 2025-10-17


  23. cognee

    Memory for AI Agents in 6 lines of code

    Project mention: The AI-Native GraphDB + GraphRAG + Graph Memory Landscape & Market Catalog | dev.to | 2025-10-26

    URLs: https://github.com/topoteretes/cognee (hosted at cognee.ai / Cogwit)

  24. paper-qa

    High accuracy RAG for answering questions from scientific documents with citations

    Project mention: Show HN: Trieve CLI – Terminal-Based LLM Agent Loop with Search Tool for PDFs | news.ycombinator.com | 2025-06-18

    https://github.com/Future-House/paper-qa?tab=readme-ov-file#... :

    > PaperQA2 is engineered to be the best agentic RAG model for working with scientific papers.

    > [ Semantic Scholar, CrossRef, ]

    paperqa-zotero: https://github.com/lejacobroy/paperqa-zotero

    The Oracle of Zotero is a fork of paperqa-zotero for FAISS and langchain:

  25. Upsonic

    Agent Framework For Fintech and Banks

    Project mention: An AI agent framework used by fintechs | news.ycombinator.com | 2025-11-22
NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).


Python rag related posts

  • Show HN: Built a tool to allow chatting with SEC filings

    1 project | news.ycombinator.com | 15 Dec 2025
  • A customizable agentic AI toolkit for e-commerce

    1 project | news.ycombinator.com | 11 Dec 2025
  • So you wanna build a local RAG?

    5 projects | news.ycombinator.com | 28 Nov 2025
  • Launch HN: Onyx (YC W24) – The open-source chat UI

    5 projects | news.ycombinator.com | 25 Nov 2025
  • Cross-Modal Embeddings: Bridging AI Modalities

    6 projects | dev.to | 21 Nov 2025
  • The AI-Native GraphDB + GraphRAG + Graph Memory Landscape & Market Catalog

    54 projects | dev.to | 26 Oct 2025
  • AI-Powered Cover Letter Generator

    3 projects | dev.to | 24 Oct 2025

Index

What are some of the best open-source rag projects in Python? This list will help you:

# Project Stars
1 awesome-llm-apps 83,772
2 ragflow 69,871
3 llama_index 45,854
4 mem0 44,522
5 chatgpt-on-wechat 40,145
6 quivr 38,685
7 MindsDB 38,107
8 khoj 31,962
9 graphrag 29,803
10 LightRAG 26,055
11 kotaemon 24,766
12 Scrapegraph-ai 22,007
13 vanna 21,958
14 graphiti 21,190
15 DB-GPT 17,819
16 DocsGPT 17,534
17 onyx 16,857
18 txtai 11,949
19 Memori 11,119
20 memvid 10,496
21 cognee 10,412
22 paper-qa 7,911
23 Upsonic 7,727

