Chandrani Mukherjee
Next-Gen Q&A: Retrieval-Augmented AI with Chroma Vector Store

A Retrieval-Augmented Generation (RAG) agent combines document retrieval with LLM-based response generation to provide intelligent, context-aware answers. In this guide, you'll build a RAG system using LangChain, ChromaDB, and an OpenAI or HuggingFace LLM.

๐Ÿ› ๏ธ Tech Stack
Python

LangChain

ChromaDB

OpenAI or HuggingFace LLMs

SentenceTransformers (all-MiniLM-L6-v2)

📦 Install Dependencies

pip install langchain chromadb sentence-transformers openai 

🧱 Folder Structure

.
├── rag_chroma_db/       # Chroma vector store
├── docs/
│   └── my_corpus.txt    # Your source document
└── rag_agent.py         # Main script

📄 Code: RAG Agent with ChromaDB

from langchain.embeddings import SentenceTransformerEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.document_loaders import TextLoader
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI  # You can also use HuggingFaceHub

# 1. Load documents
loader = TextLoader("docs/my_corpus.txt")
documents = loader.load()

# 2. Split into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = text_splitter.split_documents(documents)

# 3. Embed and store in Chroma
embedding = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
vectordb = Chroma.from_documents(documents=chunks, embedding=embedding, persist_directory="rag_chroma_db")
vectordb.persist()

# 4. Set up retriever
retriever = vectordb.as_retriever(search_kwargs={"k": 3})

# 5. Set up LLM
llm = OpenAI(temperature=0)

# 6. Create RAG chain
qa = RetrievalQA.from_chain_type(llm=llm, retriever=retriever, return_source_documents=True)

# 7. Ask questions
query = "What is the main topic of the document?"
result = qa({"query": query})
print("Answer:", result["result"])
print("Sources:", result["source_documents"])
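
Because the index is persisted to rag_chroma_db, later runs can skip re-embedding and load the store directly. A minimal sketch, assuming the same embedding model that was used to build the index:

from langchain.embeddings import SentenceTransformerEmbeddings
from langchain.vectorstores import Chroma

# Reload the persisted index instead of rebuilding it.
embedding = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
vectordb = Chroma(persist_directory="rag_chroma_db", embedding_function=embedding)
retriever = vectordb.as_retriever(search_kwargs={"k": 3})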

๐Ÿ” Set Your API Key
Make sure your environment is set with the OpenAI key:

export OPENAI_API_KEY="your-api-key" 

Or in Python:

import os
os.environ["OPENAI_API_KEY"] = "your-api-key"

🔄 What's Next?
📄 Add a PDF loader with PyMuPDF or pdfminer.six (see the loader sketch after this list)

๐Ÿ–ฅ๏ธ Add a UI with Streamlit or FastAPI

🤖 Wrap the retriever as a LangChain Tool + Agent (sketched below)

🔌 Run offline using HuggingFace LLMs (sketched below)
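
For the PDF route, LangChain ships a PyMuPDFLoader that drops in where TextLoader was used above. A minimal sketch, assuming a file at docs/my_corpus.pdf (hypothetical path) and pip install pymupdf:

from langchain.document_loaders import PyMuPDFLoader

loader = PyMuPDFLoader("docs/my_corpus.pdf")  # hypothetical path; yields one Document per page
documents = loader.load()
# Split, embed, and store exactly as in steps 2-3 of the main script.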
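
To wrap the retriever as a Tool + Agent, one option is LangChain's classic initialize_agent API. A sketch reusing the qa chain and llm from the main script:

from langchain.agents import Tool, initialize_agent, AgentType

# Expose the RAG chain as a tool the agent can call by name.
rag_tool = Tool(
    name="corpus_qa",
    func=lambda q: qa({"query": q})["result"],
    description="Answers questions about the local document corpus.",
)
agent = initialize_agent([rag_tool], llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
print(agent.run("What is the main topic of the document?"))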
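
And to go offline, swap the OpenAI LLM for a local HuggingFace pipeline. A sketch using HuggingFacePipeline with google/flan-t5-base as an example model (requires pip install transformers torch; any local model you prefer works):

from langchain.llms import HuggingFacePipeline

# Downloads the model once, then runs fully offline -- no API key needed.
llm = HuggingFacePipeline.from_model_id(
    model_id="google/flan-t5-base",  # example model, not a recommendation
    task="text2text-generation",
)
qa = RetrievalQA.from_chain_type(llm=llm, retriever=retriever, return_source_documents=True)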

💡 Summary
You now have a working Retrieval-Augmented Generation (RAG) agent using:

A local document, chunked and embedded with SentenceTransformers

Stored in a ChromaDB vector store

Queried using LangChain RetrievalQA

Answered using OpenAI GPT

Top comments (2)

Aiden Benjamin

Super clean integration of Chroma! This makes RAG pipelines much more manageable and fast to deploy.

Lucas Henry

The real-world use case you mentioned gave me some great ideas for internal enterprise tools.