This document provides a high-level introduction to the rag-pipelines repository, a framework for building domain-specific Retrieval-Augmented Generation (RAG) systems. This overview covers the repository's purpose, standardized architecture, core components, and available domain-specific implementations.
For detailed information on specific topics:
Sources: README.md1-147
The rag-pipelines repository is a production-grade framework for implementing Retrieval-Augmented Generation systems tailored to specific knowledge domains. Unlike general-purpose RAG solutions, this framework emphasizes:
The repository currently implements six production pipelines across two domains:
Sources: README.md13-35 pyproject.toml1-10
The following diagram maps the repository's physical structure to its logical components:
The repository follows a clear separation of concerns:
- src/rag_pipelines/utils/: Reusable core utilities imported by all pipelines
- baml_client/ and baml_src/: Auto-generated BAML workflow code (excluded from linting and type checking per pyproject.toml90-94)

Sources: README.md1-147 pyproject.toml1-166 .gitignore1-208
All pipelines in this repository follow three fundamental architectural patterns:
Each pipeline operates in two distinct phases:
- Phase 1 (Indexing): *_indexing.py scripts, configured by *_indexing_config.yml
- Phase 2 (RAG Execution): *_rag.py scripts, configured by *_rag_config.yml

For detailed phase specifications, see Two-Phase RAG Pattern.
Sources: README.md36-51
Phase 2 (RAG Execution) is implemented as a StateGraph from the langgraph library. Each pipeline defines:
- RAGState: A TypedDict containing question, context, retrieved documents, metadata, answer, and evaluation results
- Node functions that operate on RAGState, implementing the five standard nodes

For LangGraph implementation details, see LangGraph State Workflows and RAGState and Pipeline Nodes.
Sources: README.md17 pyproject.toml16
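The state-plus-nodes pattern described above can be sketched in plain Python. This is a minimal illustration, not the repository's code: the field names are guesses based on the prose description, the `retrieve` and `generate` nodes are stand-ins, and the `run_nodes` helper is a hypothetical substitute for the langgraph StateGraph so the example runs without third-party packages.

```python
from typing import Any, Callable, TypedDict


class RAGState(TypedDict, total=False):
    """Hypothetical state shape; field names are assumptions, not the repo's."""

    question: str
    context: str
    retrieved_documents: list[str]
    metadata: dict[str, Any]
    answer: str
    evaluation_results: dict[str, float]


def retrieve(state: RAGState) -> RAGState:
    # Stand-in retriever: a real node would query the vector store.
    docs = ["Aspirin inhibits platelet aggregation."]
    return {"retrieved_documents": docs, "context": "\n".join(docs)}


def generate(state: RAGState) -> RAGState:
    # Stand-in generator: a real node would call an LLM with the context.
    return {"answer": f"Based on the context: {state['context']}"}


def run_nodes(state: RAGState, nodes: list[Callable[[RAGState], RAGState]]) -> RAGState:
    """Apply each node in order, merging its partial update into the state."""
    for node in nodes:
        state = {**state, **node(state)}
    return state


final_state = run_nodes({"question": "How does aspirin work?"}, [retrieve, generate])
```

Each node returns only the keys it changes, and the runner merges them into the shared state. In the actual pipelines this wiring would be done by the StateGraph rather than a loop.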
All pipeline customization occurs via YAML configuration files, not code modification:
| Configuration File | Purpose | Key Sections |
|---|---|---|
| *_indexing_config.yml | Controls Phase 1 indexing process | data_loader, chunker, metadata_extractor, embeddings, vector_store |
| *_rag_config.yml | Controls Phase 2 RAG execution | metadata_extractor, retriever, reranker, llm, evaluation_metrics |
For configuration schema and usage, see Configuration Management.
Sources: README.md23
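As a rough illustration of the configuration-driven approach, the sketch below checks a parsed configuration for the key sections listed in the table. The section names come from the table; the `missing_sections` helper, the idea that every section is mandatory, and the example values are assumptions made for illustration. The config is shown as an already-parsed dictionary to keep the example free of third-party YAML libraries.

```python
# Section names are taken from the table above; treating them all as
# required is an assumption made for this sketch.
REQUIRED_SECTIONS = {
    "indexing": ["data_loader", "chunker", "metadata_extractor", "embeddings", "vector_store"],
    "rag": ["metadata_extractor", "retriever", "reranker", "llm", "evaluation_metrics"],
}


def missing_sections(config: dict, phase: str) -> list[str]:
    """Return the expected top-level sections absent from a parsed config."""
    return [name for name in REQUIRED_SECTIONS[phase] if name not in config]


# A parsed *_rag_config.yml would be a dict like this (values are placeholders).
example_rag_config = {
    "metadata_extractor": {},
    "retriever": {},
    "llm": {},
}
missing = missing_sections(example_rag_config, "rag")
```

A check like this could run before either phase starts, so that an incomplete YAML file fails early instead of partway through indexing or evaluation.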
The src/rag_pipelines/utils/ directory provides the core utilities used by all pipelines:
| Component | Class Name | Primary Purpose | Key Methods |
|---|---|---|---|
| Metadata Extraction | MetadataExtractor | LLM-powered structured metadata extraction from text | extract_metadata() |
| Contextual Reranking | ContextualReranker | Relevance-based document reordering using transformer models | rerank() |
| Document Loading | UnstructuredDocumentLoader, UnstructuredAPIDocumentLoader | PDF/document parsing via Unstructured API | load(), lazy_load() |
| Document Chunking | UnstructuredChunker | Strategy-based document chunking (basic, by_title) | chunk_documents() |
The MetadataExtractor class in src/rag_pipelines/utils/metadata_extractor.py converts user-provided JSON schemas into Pydantic models, then uses LLM structured output to extract metadata conforming to that schema. Only successfully extracted (non-null) fields appear in the result dictionary.
For detailed API documentation, see Metadata Extraction.
Sources: README.md52-84
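The "non-null fields only" behavior can be illustrated with a small standalone function. This is a sketch under stated assumptions: it skips the Pydantic model and the LLM call entirely, assumes the schema lists its fields under a `properties` key, and uses a hypothetical helper name (`keep_extracted_fields`) and made-up sample values.

```python
from typing import Any


def keep_extracted_fields(schema: dict[str, Any], raw_output: dict[str, Any]) -> dict[str, Any]:
    """Keep only schema-declared fields whose extracted value is not None."""
    declared = schema.get("properties", {})
    return {
        name: value
        for name, value in raw_output.items()
        if name in declared and value is not None
    }


# Hypothetical schema and structured output, for illustration only.
schema = {
    "properties": {
        "company": {"type": "string"},
        "fiscal_year": {"type": "integer"},
        "filing_type": {"type": "string"},
    }
}
raw_output = {"company": "Acme Corp", "fiscal_year": None, "filing_type": "10-K"}
metadata = keep_extracted_fields(schema, raw_output)
```

Dropping null fields keeps the stored metadata compact and avoids filtering on values the model could not find in the text.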
The ContextualReranker class in src/rag_pipelines/utils/contextual_reranker.py uses Contextual AI's instruction-following reranker models to score and reorder retrieved documents based on query relevance. It also supports custom instructions for domain-specific reranking behavior.
For detailed API documentation, see Contextual Reranking.
Sources: README.md54-61
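The score-and-reorder step can be sketched independently of any model. The example below swaps in a simple token-overlap score for the transformer reranker, so it only shows the reordering logic; the function names (`overlap_score`, `rerank_by_score`) and the `top_n` parameter are hypothetical and do not reflect the ContextualReranker API.

```python
def overlap_score(query: str, document: str) -> float:
    """Stand-in relevance score: fraction of query tokens found in the document."""
    query_tokens = set(query.lower().split())
    doc_tokens = set(document.lower().split())
    return len(query_tokens & doc_tokens) / len(query_tokens) if query_tokens else 0.0


def rerank_by_score(query: str, documents: list[str], top_n: int) -> list[str]:
    """Sort documents by descending score and keep the first top_n."""
    ranked = sorted(documents, key=lambda doc: overlap_score(query, doc), reverse=True)
    return ranked[:top_n]


documents = [
    "Quarterly revenue grew due to cloud services",
    "The company opened a new office",
    "Revenue growth drivers include cloud and advertising",
]
top_documents = rerank_by_score("cloud revenue growth", documents, top_n=2)
```

A model-based reranker would replace `overlap_score` with a learned relevance score, while the sort-and-truncate step stays the same.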
The following table lists core dependencies with their roles and version constraints from pyproject.toml13-39:
| Category | Package | Version Constraint | Purpose |
|---|---|---|---|
| Orchestration | langgraph | Latest | State machine workflow orchestration |
| LangChain Core | langchain-core | Latest | Base abstractions for LLM applications |
| LLM Integration | langchain-groq | Latest | Groq API client for LLM inference |
| Embeddings | langchain-huggingface, sentence-transformers | Latest | Dense vector embeddings |
| Vector Database | langchain-milvus | Latest | Milvus vector store integration |
| Document Processing | langchain-unstructured, unstructured[pdf] | Latest | PDF parsing and chunking |
| Evaluation | deepeval | >=3.7.0 | RAG evaluation metrics |
| Workflow DSL | baml-py | ==0.214.0 | BAML declarative workflows |
| Data Loading | datasets | Latest | HuggingFace datasets |
| Financial Data | edgartools | Latest | SEC EDGAR filings |
Environment variables required for external service authentication are documented in README.md109-116. For setup instructions, see Getting Started.
Sources: pyproject.toml13-39 README.md109-116
The repository provides six production-ready RAG pipelines organized by domain:
| Pipeline | Module Path | Dataset Source | Primary Use Case | Milvus Collection |
|---|---|---|---|---|
| HealthBench | src/rag_pipelines/healthbench/ | Tonic/Health-Bench (HuggingFace) | Multi-turn medical conversations with expert rubric evaluation | healthbench |
| MedCaseReasoning | src/rag_pipelines/medcasereasoning/ | Medical case studies | Clinical case analysis and diagnostic reasoning | medcasereasoning |
| MetaMedQA | src/rag_pipelines/metamedqa/ | qiaojin/MetaMedQA (HuggingFace) | USMLE medical exam preparation and medical textbook QA | metamedqa |
| PubMedQA | src/rag_pipelines/pubmedqa/ | qiaojin/PubMedQA (HuggingFace) | Biomedical research questions from PubMed articles | pubmedqa |
For detailed medical pipeline documentation, see Medical Domain Pipelines.
Sources: README.md29-32
| Pipeline | Module Path | Dataset Source | Primary Use Case | Milvus Collection |
|---|---|---|---|---|
| FinanceBench | src/rag_pipelines/financebench/ | patronus-ai/financebench (GitHub) | SEC filings QA (10-K, 10-Q, 8-K) | financebench |
| Earnings Calls | src/rag_pipelines/earnings_calls/ | lamini/earnings-calls-qa (HuggingFace) | Earnings call transcript analysis for 2800+ companies | earnings_calls |
For detailed financial pipeline documentation, see Financial Domain Pipelines.
Sources: README.md33-34
All six pipelines implement the same code structure:
```
<pipeline_name>/
├── <pipeline_name>_indexing.py          # Phase 1: Indexing script
├── <pipeline_name>_rag.py               # Phase 2: RAG execution script
├── <pipeline_name>_indexing_config.yml  # Indexing configuration
└── <pipeline_name>_rag_config.yml       # RAG execution configuration
```

This standardization enables:
- Shared utilities imported from src/rag_pipelines/utils/

Sources: README.md118-138
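Because every pipeline follows the same naming convention, the four file paths can be derived from the pipeline name alone. The helper below is a hypothetical sketch (`pipeline_files` is not a function in the repository); the root directory is taken from the module paths listed in the pipeline tables.

```python
from pathlib import Path


def pipeline_files(pipeline_name: str, root: str = "src/rag_pipelines") -> dict[str, Path]:
    """Build the four expected file paths for a pipeline from its name."""
    base = Path(root) / pipeline_name
    return {
        "indexing_script": base / f"{pipeline_name}_indexing.py",
        "rag_script": base / f"{pipeline_name}_rag.py",
        "indexing_config": base / f"{pipeline_name}_indexing_config.yml",
        "rag_config": base / f"{pipeline_name}_rag_config.yml",
    }


files = pipeline_files("pubmedqa")
```

The same function works for any of the six module directories, for example `pipeline_files("financebench")`.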
To populate a pipeline's vector database:
The indexing script's steps include:

- Document chunking with UnstructuredChunker
- Metadata extraction with MetadataExtractor (LLM-powered)
- Embedding generation with HuggingFaceEmbeddings

Sources: README.md118-127
To run RAG evaluation on a pipeline:
The RAG script executes a LangGraph StateGraph with five sequential nodes, including:

- Reranking with ContextualReranker
- Answer generation with the ChatGroq LLM

For detailed node specifications, see RAGState and Pipeline Nodes.
Sources: README.md130-138
The repository uses DeepEval (version >=3.7.0 per pyproject.toml28) for comprehensive RAG evaluation:
| Metric Category | Metrics | Purpose |
|---|---|---|
| Retrieval Quality | Contextual Recall, Contextual Precision | Measures quality of retrieved documents |
| Generation Quality | Answer Relevancy, Faithfulness | Measures LLM answer quality and factual grounding |
| Overall Relevancy | Contextual Relevancy | End-to-end relevance assessment |
Evaluation results are traced to Confident AI for debugging and performance analysis.
For evaluation configuration and metric interpretation, see Evaluation and Tracing.
Sources: README.md20-21 README.md50 pyproject.toml28
The repository implements multi-layered quality assurance:
Key quality tools configured in pyproject.toml45-166:
- Linting and type checking, excluding auto-generated BAML code (baml_client/, baml_src/)

For development setup and contribution guidelines, see Development Guide.
Sources: pyproject.toml45-166 .gitignore1-208
To begin using the repository:
- uv package manager

For detailed setup instructions, see Getting Started.
For contributing code or extending pipelines, see Development Guide.
Sources: README.md86-138