Intro to Retrieval-Augmented Generation (RAG) and Application Demos
by Henry Heng LUO
Content
• RAG Summary
• Hands-on practices
1. PRACTICE of Basic RAG pipeline
2. PRACTICE of Sentence-window retrieval pipeline
3. PRACTICE of Auto-merging retrieval pipeline
RAG Summary
• Large Language Models have intrinsic flaws.
• They can produce misleading "hallucinations"
• They rely on potentially outdated information
• They are inefficient when dealing with specific knowledge
• They lack depth in specialized fields
• They fall short in reasoning abilities
• They lack controllability
• They cannot trace the knowledge source
• They cannot protect data privacy
• They are costly to train
• Retrieval-Augmented Generation (RAG) significantly improves the precision and pertinence of
generated content by first retrieving relevant information from an external database of documents
before the language model generates its answer.
RAG Summary
• Basic RAG
• The classic basic RAG process, also known as Naive RAG, mainly
includes three basic steps:
1. Indexing - Splitting the document corpus into shorter chunks and building a
vector index through an encoder.
2. Retrieval - Retrieving relevant document fragments based on the similarity
between the question and the chunks.
3. Generation - Generating an answer to the question conditioned on the
retrieved context.
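As a rough illustration, the three steps can be condensed into a few lines of Python. This is a minimal sketch only: `embed` stands in for any sentence encoder and `llm` for any chat-completion call; neither is specified here.

```python
import numpy as np

def basic_rag(corpus: str, question: str, embed, llm, chunk_size=200, top_k=3):
    # 1. Indexing: split the corpus into short chunks and encode each one.
    chunks = [corpus[i:i + chunk_size] for i in range(0, len(corpus), chunk_size)]
    index = np.array([embed(c) for c in chunks])  # one vector per chunk

    # 2. Retrieval: rank chunks by cosine similarity to the question.
    q = np.asarray(embed(question))
    scores = index @ q / (np.linalg.norm(index, axis=1) * np.linalg.norm(q))
    context = [chunks[i] for i in np.argsort(scores)[::-1][:top_k]]

    # 3. Generation: answer the question conditioned on the retrieved context.
    prompt = "Context:\n" + "\n---\n".join(context) + f"\n\nQuestion: {question}"
    return llm(prompt)
```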
RAG Summary
• Advanced RAG
• The Advanced RAG paradigm involves additional processing in Pre-Retrieval and Post-Retrieval.
1. Before retrieval, methods such as query rewriting, routing, and
expansion can be used to align the semantic differences between questions
and document chunks.
2. After retrieval, reranking the retrieved documents can avoid the "Lost in
the Middle" phenomenon, or the context can be filtered and compressed to
shorten the window length.
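As one concrete example of a post-retrieval step, a cross-encoder can rescore retrieved chunks against the query before they enter the prompt. A minimal sketch with sentence-transformers; the checkpoint name is just a commonly used reranker, not one prescribed here:

```python
from sentence_transformers import CrossEncoder

def rerank(query: str, chunks: list[str], top_n: int = 3) -> list[str]:
    # A cross-encoder scores each (query, chunk) pair jointly, which is
    # usually more accurate than the bi-encoder similarity used for recall.
    model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
    scores = model.predict([(query, c) for c in chunks])
    ranked = sorted(zip(scores, chunks), key=lambda p: p[0], reverse=True)
    return [c for _, c in ranked[:top_n]]
```

Placing the highest-scoring chunks first also mitigates "Lost in the Middle", where evidence buried mid-prompt tends to be ignored.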
RAG Summary
• Modular RAG
• Structurally, it is freer and more flexible, introducing more specific
functional modules, such as query search engines and the fusion of
multiple answers.
• Technologically, it integrates retrieval with fine-tuning, reinforcement
learning, and other techniques.
• In terms of process, the RAG modules are designed and orchestrated,
resulting in various RAG patterns.
RAG Summary
• To build a good RAG system, three critical questions need to be
considered:
• What to retrieve?
• When to retrieve?
• How to use the retrieved content?
RAG Summary
• Augmentation Sources. These include unstructured data such as text
paragraphs, phrases, or individual words. Structured data can also be
used, such as indexed documents, triple data, or subgraphs; retrieval can
even draw on content generated by LLMs themselves.
• Augmentation Stages. Augmentation can be performed during the
pre-training, fine-tuning, and inference stages.
• Augmentation Process. Retrieval was initially a one-off step, but
iterative retrieval, recursive retrieval, and adaptive retrieval
methods, where LLMs decide the timing of retrieval on their own, have
gradually emerged as RAG has developed; a minimal loop is sketched below.
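Here `retrieve` and `llm` are hypothetical helpers, and the stopping protocol (the model emits "SEARCH: ..." when it wants more evidence) is an assumption for illustration:

```python
def iterative_rag(question: str, retrieve, llm, max_rounds: int = 3) -> str:
    # The LLM itself decides whether to answer or to retrieve again.
    context, query = [], question
    for _ in range(max_rounds):
        context += retrieve(query)  # hypothetical: returns a list of passages
        prompt = ("Context:\n" + "\n".join(context) +
                  f"\n\nQuestion: {question}\n"
                  "If the context is sufficient, answer directly; otherwise "
                  "reply with 'SEARCH: <follow-up query>'.")
        reply = llm(prompt)
        if not reply.startswith("SEARCH:"):
            return reply
        query = reply[len("SEARCH:"):].strip()
    # Out of rounds: answer with whatever evidence was gathered.
    return llm("Context:\n" + "\n".join(context) + f"\n\nQuestion: {question}")
```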
RAG Summary
• RAG is like giving the model a textbook for customized information
retrieval, which is very suitable for specific queries.
• Fine-tuning is like a student internalizing knowledge over time, better
suited for mimicking specific structures, styles, or formats.
• Depending on their reliance on external knowledge and requirements
for model adjustment, they each have suitable scenarios.
• Using RAG, fine-tuning, and prompt engineering together may yield the
best results.
RAG Summary
• The evaluation methods for RAG are diverse, mainly including three
quality scores: context relevance, answer fidelity, and answer
relevance.
• The evaluation involves four key capabilities: noise robustness, refusal
ability, information integration, and counterfactual robustness.
• In terms of evaluation frameworks, there are benchmarks such as
RGB and RECALL, as well as automated evaluation tools like RAGAS,
ARES, and TruLens, which help to comprehensively measure the
performance of RAG models.
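For instance, the ragas library computes scores along these same quality dimensions. A minimal sketch; metric names and the expected dataset columns vary across ragas versions, so treat the details as assumptions:

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import answer_relevancy, context_precision, faithfulness

# One toy evaluation record: question, retrieved contexts, generated answer.
data = Dataset.from_dict({
    "question": ["What is RAG?"],
    "contexts": [["RAG retrieves relevant documents before generation."]],
    "answer": ["RAG retrieves documents and conditions the answer on them."],
    "ground_truth": ["RAG augments LLMs with retrieved external documents."],
})

result = evaluate(data, metrics=[faithfulness, answer_relevancy, context_precision])
print(result)  # per-metric scores in [0, 1]
```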
RAG Summary
• To address the current challenges faced by RAG:
• Context length. What to do when the retrieved content is too long and exceeds the window limit? If the
context window of LLMs is no longer limited, how should RAG be improved?
• Robustness. How to deal with incorrect content retrieved? How to filter and validate the retrieved content?
How to enhance the model's resistance to poisoning and noise?
• Coordination with fine-tuning. How to leverage the effects of both RAG and FT simultaneously? How should
they be coordinated and organized: in series, in alternation, or end-to-end?
• Scaling Laws. Does the RAG model satisfy the Scaling Law? Under what scenarios might RAG experience
the phenomenon of an Inverse Scaling Law?
• The role of LLMs. LLMs can be used for retrieval (replacing search with LLMs' generation, or searching LLMs'
memory), for generation, and for evaluation. How to further explore the potential of LLMs in RAG?
• Production-ready. How to reduce the retrieval latency of ultra-large-scale corpora? How to ensure that the
retrieved content is not leaked by LLMs?
• Multimodal Expansion. How can the evolving technologies and concepts of RAG be extended to other
modalities of data such as images, audio, video, or code?
RAG Summary
• RAG can be applied to question-answering systems and beyond, such
as recommendation systems, information extraction, and report
generation.
• The RAG technology stack is booming. In addition to well-known tools
like Langchain and LlamaIndex, the market is seeing the emergence of
more targeted RAG tools, such as customized and simplified tools.
RAG Summary
"Retrieval-Augmented Generation for Large Language Models: A Survey"
PRACTICE of Basic RAG pipeline
• We want to infuse existing database information into the LLM.
• Each query is first used to retrieve related context information from
the existing database (a vector database can be used here); the context
information is then wrapped into the prompt and sent to the LLM.
• Split the documents into small chunks.
• Search for the semantically matching small chunks.
• Return the top-k small chunks.
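A minimal sketch of this basic pipeline with LlamaIndex; the import paths follow recent llama-index releases, and the "./data" directory and query are hypothetical:

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Indexing: load the documents, split them into chunks, and embed the chunks.
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Retrieval + generation: fetch the top-k matching chunks, wrap them into the
# prompt, and let the LLM synthesize the answer.
query_engine = index.as_query_engine(similarity_top_k=3)
print(query_engine.query("What are the key findings of the report?"))
```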
PRACTICE of Basic RAG pipeline
[Figure: the same text chunks are used for both embeddings and synthesis]
PRACTICE of Sentence-window retrieval pipeline
• This is suitable when plenty of context information is needed, rather
than only a small chunk.
• Split the documents at the sentence level.
• Search for the semantically matching sentence.
• Retrieve the matched sentence together with a window of the preceding
and following sentences to form the context chunk.
• Rerank the context chunks.
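A minimal sketch with LlamaIndex's sentence-window components; import paths, the reranker checkpoint, and the data directory are assumptions based on recent llama-index releases:

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.node_parser import SentenceWindowNodeParser
from llama_index.core.postprocessor import (
    MetadataReplacementPostProcessor,
    SentenceTransformerRerank,
)

# Split at sentence level; each node keeps a +/-3 sentence window in metadata.
parser = SentenceWindowNodeParser.from_defaults(
    window_size=3, window_metadata_key="window"
)
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex(parser.get_nodes_from_documents(documents))

query_engine = index.as_query_engine(
    similarity_top_k=6,
    node_postprocessors=[
        # Swap each matched sentence for its surrounding window of sentences...
        MetadataReplacementPostProcessor(target_metadata_key="window"),
        # ...then rerank the expanded context chunks with a cross-encoder.
        SentenceTransformerRerank(top_n=2, model="BAAI/bge-reranker-base"),
    ],
)
print(query_engine.query("What are the concerns surrounding the AMOC?"))
```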
PRACTICE of Sentence-window retrieval pipeline
[Figure: example query "What are the concerns surrounding the AMOC?"]
PRACTICE of Auto-merging retrieval pipeline
• Small chunks are good for precise matching, but we also need plenty
of context information.
• Define a hierarchy of smaller chunks linked to parent chunks.
• If the set of smaller chunks linking to a parent chunk exceeds some
threshold, "merge" the smaller chunks into the bigger parent chunk.
• Rerank the final parent chunks.
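A minimal sketch with LlamaIndex's hierarchical parser and auto-merging retriever; chunk sizes, paths, and import locations are assumptions based on recent llama-index releases:

```python
from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.core.node_parser import HierarchicalNodeParser, get_leaf_nodes
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.retrievers import AutoMergingRetriever

# Build a chunk hierarchy: 2048-token parents, 512 mid-level, 128-token leaves.
parser = HierarchicalNodeParser.from_defaults(chunk_sizes=[2048, 512, 128])
documents = SimpleDirectoryReader("./data").load_data()
nodes = parser.get_nodes_from_documents(documents)

# Store every level so retrieved leaves can be merged back into their parents;
# only the leaf chunks are embedded, keeping the matching precise.
storage_context = StorageContext.from_defaults()
storage_context.docstore.add_documents(nodes)
index = VectorStoreIndex(get_leaf_nodes(nodes), storage_context=storage_context)

# When enough sibling leaves are retrieved, they "merge" into the parent chunk.
retriever = AutoMergingRetriever(
    index.as_retriever(similarity_top_k=12), storage_context, verbose=True
)
print(RetrieverQueryEngine.from_args(retriever).query("What does the report conclude?"))
```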
PRACTICE of Auto-merging retrieval pipeline
[Figure: auto-merging returned chunk]