🔗 Part 4: Creating the LangChain Pipeline
In this part, we’ll build the LangChain-powered backend pipeline that connects your chatbot to MongoDB and Pinecone: it loads order data, splits it into chunks, generates embeddings, and retrieves the most relevant context for user queries.
✅ What We'll Cover
- Setting up document loading from MongoDB
- Splitting order data into chunks
- Creating embeddings from chunks
- Storing vectors in Pinecone
- Retrieving relevant chunks for user queries
🧱 1. Load Data from MongoDB
We’ll load each order from MongoDB and format it as a LangChain-style document: an object with a `pageContent` string (the text that gets embedded) and a `metadata` object (used later to trace a chunk back to its order).
```js
// backend/langchain/loadOrders.js
const { connectToDatabase } = require('../database/connection');

async function loadOrderDocuments() {
  const db = await connectToDatabase();
  const orders = await db.collection('orders').find().toArray();

  return orders.map(order => ({
    pageContent: `
Order ID: ${order.orderId}
Customer: ${order.customerName}
Email: ${order.email}
Items: ${order.items.map(i => `${i.productName} x${i.quantity}`).join(', ')}
Total: $${order.totalAmount}
Status: ${order.status}
Date: ${order.orderDate.toDateString()}
`,
    metadata: { orderId: order.orderId },
  }));
}

module.exports = { loadOrderDocuments };
```
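To sanity-check the loader before moving on, you can print the first formatted document. This is just a quick sketch; it assumes the `orders` collection from the earlier parts is already populated:

```js
// Quick check: print the first document produced by the loader
const { loadOrderDocuments } = require('./loadOrders');

loadOrderDocuments()
  .then(docs => console.log(docs[0]))
  .catch(console.error);
```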
✂️ 2. Split Data into Chunks
We use LangChain's `RecursiveCharacterTextSplitter` to break each document into ~500-character chunks with a 50-character overlap, so chunks stay small enough to embed well while the overlap preserves context across chunk boundaries.
```js
// backend/langchain/splitter.js
const { RecursiveCharacterTextSplitter } = require('@langchain/textsplitters');

async function splitDocuments(documents) {
  // ~500 characters per chunk, with a 50-character overlap so that
  // context isn't lost at chunk boundaries
  const splitter = new RecursiveCharacterTextSplitter({
    chunkSize: 500,
    chunkOverlap: 50,
  });
  return await splitter.splitDocuments(documents);
}

module.exports = { splitDocuments };
```
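If you want to see the splitter in action before wiring up the full pipeline, a small script like this one (hypothetical, reusing the two modules above) shows how many chunks your orders produce:

```js
// Quick check: how many chunks do the order documents produce?
const { loadOrderDocuments } = require('./loadOrders');
const { splitDocuments } = require('./splitter');

(async () => {
  const docs = await loadOrderDocuments();
  const chunks = await splitDocuments(docs);
  console.log(`Split ${docs.length} documents into ${chunks.length} chunks`);
})();
```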
🔁 3. Embed & Store in Pinecone
Now we’ll embed each chunk with OpenAI and upsert the resulting vectors into our Pinecone index.
```js
// backend/langchain/storeChunks.js
const { OpenAIEmbeddings } = require('@langchain/openai');
const { PineconeStore } = require('@langchain/pinecone');
const { initPinecone } = require('./config');

async function storeChunksInPinecone(chunks) {
  const embeddings = new OpenAIEmbeddings({
    openAIApiKey: process.env.OPENAI_API_KEY,
  });

  const pinecone = await initPinecone();
  const index = pinecone.Index('ecommerce-orders');

  // Embed each chunk and upsert the resulting vectors into the index
  await PineconeStore.fromDocuments(chunks, embeddings, {
    pineconeIndex: index,
  });

  console.log('Chunks stored in Pinecone.');
}

module.exports = { storeChunksInPinecone };
```
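Note that `storeChunks.js` imports `initPinecone` from `./config`. If you haven't already created that file in an earlier part, here's a minimal sketch, assuming the official `@pinecone-database/pinecone` client and a `PINECONE_API_KEY` environment variable:

```js
// backend/langchain/config.js
const { Pinecone } = require('@pinecone-database/pinecone');

async function initPinecone() {
  // Reads the API key from the environment; recent versions of the
  // Pinecone client need only the API key to connect
  return new Pinecone({ apiKey: process.env.PINECONE_API_KEY });
}

module.exports = { initPinecone };
```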
🧪 4. Pipeline Runner
Let’s put it all together:
```js
// backend/langchain/pipeline.js
const { loadOrderDocuments } = require('./loadOrders');
const { splitDocuments } = require('./splitter');
const { storeChunksInPinecone } = require('./storeChunks');

async function runLangChainPipeline() {
  const docs = await loadOrderDocuments();
  const chunks = await splitDocuments(docs);
  await storeChunksInPinecone(chunks);
}

// Surface any failures instead of leaving an unhandled rejection
runLangChainPipeline().catch(console.error);
```
Run the pipeline:
```bash
node backend/langchain/pipeline.js
```
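🔍 5. Retrieve Relevant Chunks for User Queries
The coverage list above also promised retrieval, so here's a minimal sketch of the query side. `retrieveRelevantChunks` is a name introduced here for illustration; it reconnects to the same index with LangChain's `PineconeStore.fromExistingIndex` and runs a similarity search:

```js
// backend/langchain/retrieve.js
const { OpenAIEmbeddings } = require('@langchain/openai');
const { PineconeStore } = require('@langchain/pinecone');
const { initPinecone } = require('./config');

async function retrieveRelevantChunks(query, k = 4) {
  const embeddings = new OpenAIEmbeddings({
    openAIApiKey: process.env.OPENAI_API_KEY,
  });

  const pinecone = await initPinecone();
  const index = pinecone.Index('ecommerce-orders');

  // Reconnect to the index we populated in step 3
  const vectorStore = await PineconeStore.fromExistingIndex(embeddings, {
    pineconeIndex: index,
  });

  // Return the k chunks most semantically similar to the query
  return vectorStore.similaritySearch(query, k);
}

module.exports = { retrieveRelevantChunks };
```

These retrieved chunks are the context we'll feed into the prompt templates in Part 5.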
✅ Next Steps (Part 5)
In the next part, we will:
- Design prompt templates for order-related queries
- Handle multi-turn conversations
- Implement memory using LangChain for context retention
🚀 Stay tuned for Part 5: Designing Conversational Logic!