ContextCrunch-LangChain-python

Integration for ContextCrunch in a LangChain pipeline.

Quickstart

  1. Install this package with pip install contextcrunch-langchain.
  2. Add your ContextCrunch API key to your environment file, e.g. CONTEXTCRUNCH_API_KEY="aaa-bbb-ccc-ddd".
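If you keep the key in a .env file, one way to load it before building a chain is with python-dotenv. This is a minimal sketch; python-dotenv is an assumption here, not a documented dependency of this package:

```python
import os

from dotenv import load_dotenv  # assumed helper: pip install python-dotenv

load_dotenv()  # reads CONTEXTCRUNCH_API_KEY from a local .env file into os.environ
assert os.environ.get("CONTEXTCRUNCH_API_KEY"), "ContextCrunch API key not found"
```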

RAG

You can modify an existing RAG pipeline by applying a ContextCruncher() to the context before it fills the prompt template.

For example, if you are using the RAG example from the LangChain docs, the modified pipeline becomes:

```python
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | ContextCruncher(compression_ratio=0.95)
    | prompt
    | llm
    | StrOutputParser()
)
```

Conversations

You can use ConversationCruncher() to compress a long message history.

Here is an example using ConversationBufferMemory, which is a LangChain memory module that stores the entire conversation history.

```python
from operator import itemgetter

from langchain.memory import ConversationBufferMemory
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableLambda, RunnablePassthrough
from langchain_openai import ChatOpenAI

from contextcrunch_langchain_python import ConversationCruncher

model = ChatOpenAI()
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "Conversation Summary:\n{history}"),
        ("human", "{input}"),
    ]
)
memory = ConversationBufferMemory()
memory.chat_memory.add_user_message("My favourite color is purple, my favourite food is pizza.")
memory.chat_memory.add_ai_message("I understand. Your favourite color is purple, and your favourite food is pizza.")
chain = (
    {
        'history': RunnableLambda(memory.load_memory_variables) | itemgetter("history"),
        'input': RunnablePassthrough(),
    }  # fetch the history, and feed the input to the next step
    | ConversationCruncher()  # history and input are compressed, then fed into the prompt template (which takes 'history' and 'input' as inputs)
    | prompt
    | model
)
chain.invoke("What is my favourite color?")  # small contexts won't get compressed, so ConversationCruncher() acts as a passthrough
```

Usage

ContextCruncher (RAG)

The ContextCruncher is a RunnableLambda that takes two inputs (as an input dictionary):

  • context: The retrieved information from the RAG step.
  • question: The relevant query to find in the data. ContextCrunch uses this to narrow down the context to only the most essential parts.

ContextCruncher returns a dictionary with (a direct-invocation sketch follows this list):

  • context: The updated (compressed) context.
  • question: The original question (for later use in the chain).
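
Putting the input and output shapes together, a minimal direct invocation could look like the following sketch. The import path mirrors the document-compressor example later in this README, and the strings are placeholders:

```python
from contextcrunch_langchain_python import ContextCruncher

cruncher = ContextCruncher(compression_ratio=0.9)

# Input and output are both dictionaries with 'context' and 'question' keys.
result = cruncher.invoke({
    "context": "<long retrieved passage from your vector store>",
    "question": "What is Task Decomposition?",
})
print(result["context"])   # the compressed context
print(result["question"])  # the original question, passed through
```

Because ContextCruncher is a RunnableLambda, it supports the standard Runnable invoke interface and composes with | in a chain.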

ConversationCruncher (Chat)

The ConversationCruncher is a RunnableLambda that takes two inputs (as an input dictionary):

  • history: Your relevant conversation history.
  • input: The most recent user message. ContextCrunch uses this to narrow down the conversation history to the parts relevant to the input.

ConversationCruncher returns a dictionary with (a direct-invocation sketch follows this list):

  • history: The compressed message history, as a single string. Ideally, feed this into a system message indicating that it is the conversation history.
  • input: The user message, unmodified (for later use in the chain).
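
The same direct-invocation pattern applies here; a minimal sketch with placeholder strings:

```python
from contextcrunch_langchain_python import ConversationCruncher

cruncher = ConversationCruncher()

result = cruncher.invoke({
    "history": "<long transcript of earlier user/AI turns>",
    "input": "What is my favourite color?",
})
print(result["history"])  # compressed history, as a single string
print(result["input"])    # the latest user message, unmodified
```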

ContextCrunchDocumentCompressor (RAG w/ Documents)

As an alternative to ContextCruncher, if your data is already in the form of LangChain Documents, or if you would prefer to work within a larger document compression pipeline, you can use ContextCrunchDocumentCompressor. It takes a list of documents and a query string (a direct-call sketch follows).
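
Since it plugs into ContextualCompressionRetriever as a base_compressor (see the pipeline below), it should expose LangChain's standard BaseDocumentCompressor interface; here is a minimal direct-call sketch under that assumption, with placeholder document contents:

```python
from langchain_core.documents import Document

from contextcrunch_langchain_python import ContextCrunchDocumentCompressor

compressor = ContextCrunchDocumentCompressor(compression_ratio=0.8)

docs = [
    Document(page_content="<chunk 1 of retrieved text>"),
    Document(page_content="<chunk 2 of retrieved text>"),
]

# compress_documents is the standard BaseDocumentCompressor method:
# it takes a sequence of Documents and a query string.
compressed = compressor.compress_documents(docs, query="What is Task Decomposition?")
for doc in compressed:
    print(doc.page_content)
```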

Here's how it would look in a typical document compression pipeline:

```python
import bs4
from langchain import hub
from langchain.retrievers import ContextualCompressionRetriever
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

from contextcrunch_langchain_python import ContextCrunchDocumentCompressor

cc_compressor = ContextCrunchDocumentCompressor(compression_ratio=0.8)

# Make sure to initialize the base retriever as you would like. For example,
# here is a web scraper/splitter with Chroma, from the LangChain docs.
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
docs = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)
vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())
base_retriever = vectorstore.as_retriever()

# Now wrap the base_retriever in a compression retriever using ContextCrunch.
retriever = ContextualCompressionRetriever(base_compressor=cc_compressor, base_retriever=base_retriever)

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
prompt = hub.pull("rlm/rag-prompt")  # the prompt used in the LangChain docs example

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
rag_chain.invoke("What is Task Decomposition?")
```

Compression Ratio

When initializing ContextCruncher(), ConversationCruncher(), or ContextCrunchDocumentCompressor(), there is an optional compression_ratio parameter that controls how aggressively the algorithm compresses. The general trend is that the higher the compression ratio, the less information is retained. A compression ratio of 0.9 is a good starting point, though for small contexts the algorithm may compress less than the requested ratio.
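
For example, following the convention above (a higher ratio means more aggressive compression; the exact amount retained depends on your context):

```python
# Aggressive: aims to remove roughly 95% of the input, keeping only the essentials.
aggressive = ContextCruncher(compression_ratio=0.95)

# Conservative: removes less, so more of the original information is retained.
conservative = ContextCruncher(compression_ratio=0.8)
```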
