Chandrani Mukherjee
Next-Gen Q&A: Retrieval-Augmented AI with Chroma Vector Store

A Retrieval-Augmented Generation (RAG) agent combines document retrieval with LLM-based response generation to provide intelligent, context-aware answers. In this guide, you'll build a RAG system using LangChain, ChromaDB, and an OpenAI or HuggingFace LLM.

๐Ÿ› ๏ธ Tech Stack
Python

LangChain

ChromaDB

OpenAI or HuggingFace LLMs

SentenceTransformers (all-MiniLM-L6-v2)

📦 Install Dependencies

pip install langchain chromadb sentence-transformers openai 

🧱 Folder Structure

.
├── rag_chroma_db/       # Chroma vector store
├── docs/
│   └── my_corpus.txt    # Your source document
└── rag_agent.py         # Main script

📄 Code: RAG Agent with ChromaDB

from langchain.embeddings import SentenceTransformerEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.document_loaders import TextLoader
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI  # You can also use HuggingFaceHub

# 1. Load documents
loader = TextLoader("docs/my_corpus.txt")
documents = loader.load()

# 2. Split into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = text_splitter.split_documents(documents)

# 3. Embed and store in Chroma
embedding = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
vectordb = Chroma.from_documents(documents=chunks, embedding=embedding, persist_directory="rag_chroma_db")
vectordb.persist()

# 4. Set up retriever
retriever = vectordb.as_retriever(search_kwargs={"k": 3})

# 5. Set up LLM
llm = OpenAI(temperature=0)

# 6. Create RAG chain
qa = RetrievalQA.from_chain_type(llm=llm, retriever=retriever, return_source_documents=True)

# 7. Ask questions
query = "What is the main topic of the document?"
result = qa({"query": query})
print("Answer:", result["result"])
print("Sources:", result["source_documents"])
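
Because the index is persisted to rag_chroma_db, later runs can skip re-embedding and load the store directly. A minimal sketch, assuming the same embedding model that was used to build the index:

from langchain.embeddings import SentenceTransformerEmbeddings
from langchain.vectorstores import Chroma

# Reload the persisted index instead of rebuilding it.
embedding = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
vectordb = Chroma(persist_directory="rag_chroma_db", embedding_function=embedding)
retriever = vectordb.as_retriever(search_kwargs={"k": 3})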

๐Ÿ” Set Your API Key
Make sure your environment is set with the OpenAI key:

export OPENAI_API_KEY="your-api-key" 

Or in Python:

import os
os.environ["OPENAI_API_KEY"] = "your-api-key"

🔄 What's Next?
📄 Add a PDF loader with PyMuPDF or pdfminer.six (see the loader sketch after this list)

๐Ÿ–ฅ๏ธ Add a UI with Streamlit or FastAPI

🤖 Wrap the retriever as a LangChain Tool + Agent (sketched below)

🔌 Run offline using HuggingFace LLMs (sketched below)
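
For the PDF route, LangChain ships a PyMuPDFLoader that drops in where TextLoader was used above. A minimal sketch, assuming a file at docs/my_corpus.pdf (hypothetical path) and pip install pymupdf:

from langchain.document_loaders import PyMuPDFLoader

loader = PyMuPDFLoader("docs/my_corpus.pdf")  # hypothetical path; yields one Document per page
documents = loader.load()
# Split, embed, and store exactly as in steps 2-3 of the main script.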
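
To wrap the retriever as a Tool + Agent, one option is LangChain's classic initialize_agent API. A sketch reusing the qa chain and llm from the main script:

from langchain.agents import Tool, initialize_agent, AgentType

# Expose the RAG chain as a tool the agent can call by name.
rag_tool = Tool(
    name="corpus_qa",
    func=lambda q: qa({"query": q})["result"],
    description="Answers questions about the local document corpus.",
)
agent = initialize_agent([rag_tool], llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
print(agent.run("What is the main topic of the document?"))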
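
And to go offline, swap the OpenAI LLM for a local HuggingFace pipeline. A sketch using HuggingFacePipeline with google/flan-t5-base as an example model (requires pip install transformers torch; any local model you prefer works):

from langchain.llms import HuggingFacePipeline

# Downloads the model once, then runs fully offline -- no API key needed.
llm = HuggingFacePipeline.from_model_id(
    model_id="google/flan-t5-base",  # example model, not a recommendation
    task="text2text-generation",
)
qa = RetrievalQA.from_chain_type(llm=llm, retriever=retriever, return_source_documents=True)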

💡 Summary
You now have a working Retrieval-Augmented Generation (RAG) agent using:

A local document, chunked and embedded with SentenceTransformers

Stored in a ChromaDB vector store

Queried using LangChain RetrievalQA

Answered using OpenAI GPT

Top comments (2)

Aiden Benjamin

Super clean integration of Chroma! This makes RAG pipelines much more manageable and fast to deploy.

Lucas Henry

The real-world use case you mentioned gave me some great ideas for internal enterprise tools.