
Overview

A vector store holds embedded data and performs similarity search over it.

Interface

LangChain provides a unified interface for vector stores, allowing you to:
  • add_documents - Add documents to the store.
  • delete - Remove stored documents by ID.
  • similarity_search - Query for semantically similar documents.
This abstraction lets you switch between different implementations without altering your application logic.
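To make the shape of that interface concrete, here is a deliberately tiny, hypothetical store written from scratch (not LangChain's actual classes) that exposes the same three operations; any backend implementing them can be swapped in without touching the calling code:

```python
# Hypothetical minimal store illustrating the shared interface
# (add_documents / delete / similarity_search); not LangChain's implementation.
class TinyVectorStore:
    def __init__(self, embed):
        self.embed = embed  # function: str -> list[float]
        self.docs = {}      # id -> (text, vector)

    def add_documents(self, texts, ids):
        for text, doc_id in zip(texts, ids):
            self.docs[doc_id] = (text, self.embed(text))

    def delete(self, ids):
        for doc_id in ids:
            self.docs.pop(doc_id, None)

    def similarity_search(self, query, k=4):
        # Rank stored vectors by dot product with the query vector.
        qv = self.embed(query)
        scored = sorted(
            self.docs.values(),
            key=lambda item: sum(a * b for a, b in zip(qv, item[1])),
            reverse=True,
        )
        return [text for text, _ in scored[:k]]

# Toy "embedding": counts of two keywords (purely illustrative).
embed = lambda t: [t.count("cat"), t.count("dog")]
store = TinyVectorStore(embed)
store.add_documents(["cat cat", "dog dog"], ids=["a", "b"])
print(store.similarity_search("cat", k=1))  # ['cat cat']
```

The application code only ever calls the three methods, which is why real stores backed by very different engines remain interchangeable.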

Initialization

To initialize a vector store, provide it with an embedding model:

```python
from langchain_core.vectorstores import InMemoryVectorStore

vector_store = InMemoryVectorStore(embedding=SomeEmbeddingModel())
```

Adding documents

Add Document objects (each holding page_content and optional metadata) like so:

```python
from langchain_core.documents import Document

# Example documents; contents and metadata are illustrative.
doc1 = Document(page_content="first document", metadata={"source": "tweets"})
doc2 = Document(page_content="second document", metadata={"source": "news"})

vector_store.add_documents(documents=[doc1, doc2], ids=["id1", "id2"])
```

Deleting documents

Delete by specifying IDs:

```python
vector_store.delete(ids=["id1"])
```

Searching

Issue a semantic query with similarity_search, which returns the documents whose embeddings are closest to the query:

```python
similar_docs = vector_store.similarity_search("your query here")
```
Many vector stores support parameters like:
  • k — number of results to return
  • filter — conditional filtering based on metadata

Similarity metrics & indexing

Embedding similarity may be computed using:
  • Cosine similarity
  • Euclidean distance
  • Dot product
Efficient search often employs indexing methods such as HNSW (Hierarchical Navigable Small World), though specifics depend on the vector store.
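As a quick illustration of the three metrics (pure Python, not tied to any particular store):

```python
import math

def dot(a, b):
    # Dot product: large when vectors point the same way and are long.
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    # Dot product normalized by magnitudes; ranges from -1 to 1.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean_distance(a, b):
    # Straight-line distance; smaller means more similar.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

a = [1.0, 0.0, 1.0]
b = [1.0, 1.0, 0.0]
print(cosine_similarity(a, b))   # 0.5
print(euclidean_distance(a, b))  # ~1.414 (sqrt(2))
print(dot(a, b))                 # 1.0
```

Which metric a store uses matters: cosine similarity ignores vector length, while dot product and Euclidean distance do not, so results can differ between stores configured with different metrics.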

Metadata filtering

Filtering by metadata (e.g., source, date) can refine search results:

```python
vector_store.similarity_search(
    "query",
    k=3,
    filter={"source": "tweets"},
)
```
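Conceptually, a filter keeps only documents whose metadata matches every key/value pair in the filter dict. The sketch below models that with plain dictionaries; real stores push filtering into the index or database, and filter syntax varies by integration:

```python
# Simplified model of metadata filtering (illustrative, not a real store's logic).
def matches(metadata: dict, flt: dict) -> bool:
    # A document passes if every filter key is present with the same value.
    return all(metadata.get(key) == value for key, value in flt.items())

docs = [
    {"page_content": "breaking news", "metadata": {"source": "news"}},
    {"page_content": "hot take", "metadata": {"source": "tweets"}},
    {"page_content": "another take", "metadata": {"source": "tweets"}},
]

filtered = [d for d in docs if matches(d["metadata"], {"source": "tweets"})]
print([d["page_content"] for d in filtered])  # ['hot take', 'another take']
```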

Top integrations

Select embedding model:
```shell
pip install -qU langchain-openai
```

```python
import getpass
import os

if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter API key for OpenAI: ")

from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-large")
```
```shell
pip install -qU "langchain[azure]"
```

```python
import getpass
import os

if not os.environ.get("AZURE_OPENAI_API_KEY"):
    os.environ["AZURE_OPENAI_API_KEY"] = getpass.getpass("Enter API key for Azure: ")

from langchain_openai import AzureOpenAIEmbeddings

embeddings = AzureOpenAIEmbeddings(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    azure_deployment=os.environ["AZURE_OPENAI_DEPLOYMENT_NAME"],
    openai_api_version=os.environ["AZURE_OPENAI_API_VERSION"],
)
```
```shell
pip install -qU langchain-google-genai
```

```python
import getpass
import os

if not os.environ.get("GOOGLE_API_KEY"):
    os.environ["GOOGLE_API_KEY"] = getpass.getpass("Enter API key for Google Gemini: ")

from langchain_google_genai import GoogleGenerativeAIEmbeddings

embeddings = GoogleGenerativeAIEmbeddings(model="models/gemini-embedding-001")
```
```shell
pip install -qU langchain-google-vertexai
```

```python
from langchain_google_vertexai import VertexAIEmbeddings

embeddings = VertexAIEmbeddings(model="text-embedding-005")
```
```shell
pip install -qU langchain-aws
```

```python
from langchain_aws import BedrockEmbeddings

embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v2:0")
```
```shell
pip install -qU langchain-huggingface
```

```python
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
```
```shell
pip install -qU langchain-ollama
```

```python
from langchain_ollama import OllamaEmbeddings

embeddings = OllamaEmbeddings(model="llama3")
```
```shell
pip install -qU langchain-cohere
```

```python
import getpass
import os

if not os.environ.get("COHERE_API_KEY"):
    os.environ["COHERE_API_KEY"] = getpass.getpass("Enter API key for Cohere: ")

from langchain_cohere import CohereEmbeddings

embeddings = CohereEmbeddings(model="embed-english-v3.0")
```
```shell
pip install -qU langchain-mistralai
```

```python
import getpass
import os

if not os.environ.get("MISTRALAI_API_KEY"):
    os.environ["MISTRALAI_API_KEY"] = getpass.getpass("Enter API key for MistralAI: ")

from langchain_mistralai import MistralAIEmbeddings

embeddings = MistralAIEmbeddings(model="mistral-embed")
```
```shell
pip install -qU langchain-nomic
```

```python
import getpass
import os

if not os.environ.get("NOMIC_API_KEY"):
    os.environ["NOMIC_API_KEY"] = getpass.getpass("Enter API key for Nomic: ")

from langchain_nomic import NomicEmbeddings

embeddings = NomicEmbeddings(model="nomic-embed-text-v1.5")
```
```shell
pip install -qU langchain-nvidia-ai-endpoints
```

```python
import getpass
import os

if not os.environ.get("NVIDIA_API_KEY"):
    os.environ["NVIDIA_API_KEY"] = getpass.getpass("Enter API key for NVIDIA: ")

from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings

embeddings = NVIDIAEmbeddings(model="NV-Embed-QA")
```
```shell
pip install -qU langchain-voyageai
```

```python
import getpass
import os

if not os.environ.get("VOYAGE_API_KEY"):
    os.environ["VOYAGE_API_KEY"] = getpass.getpass("Enter API key for Voyage AI: ")

from langchain_voyageai import VoyageAIEmbeddings

embeddings = VoyageAIEmbeddings(model="voyage-3")
```
```shell
pip install -qU langchain-ibm
```

```python
import getpass
import os

if not os.environ.get("WATSONX_APIKEY"):
    os.environ["WATSONX_APIKEY"] = getpass.getpass("Enter API key for IBM watsonx: ")

from langchain_ibm import WatsonxEmbeddings

embeddings = WatsonxEmbeddings(
    model_id="ibm/slate-125m-english-rtrvr",
    url="https://us-south.ml.cloud.ibm.com",
    project_id="<WATSONX PROJECT_ID>",
)
```
```shell
pip install -qU langchain-core
```

```python
from langchain_core.embeddings import DeterministicFakeEmbedding

embeddings = DeterministicFakeEmbedding(size=4096)
```
Select vector store:
```shell
pip install -qU langchain-core
```

```python
from langchain_core.vectorstores import InMemoryVectorStore

vector_store = InMemoryVectorStore(embeddings)
```
```shell
pip install -qU langchain-astradb
```

```python
from langchain_astradb import AstraDBVectorStore

vector_store = AstraDBVectorStore(
    embedding=embeddings,
    api_endpoint=ASTRA_DB_API_ENDPOINT,
    collection_name="astra_vector_langchain",
    token=ASTRA_DB_APPLICATION_TOKEN,
    namespace=ASTRA_DB_NAMESPACE,
)
```
```shell
pip install -qU langchain-chroma
```

```python
from langchain_chroma import Chroma

vector_store = Chroma(
    collection_name="example_collection",
    embedding_function=embeddings,
    persist_directory="./chroma_langchain_db",  # Where to save data locally; remove if not necessary
)
```
```shell
pip install -qU langchain-community
```

```python
import faiss
from langchain_community.docstore.in_memory import InMemoryDocstore
from langchain_community.vectorstores import FAISS

embedding_dim = len(embeddings.embed_query("hello world"))
index = faiss.IndexFlatL2(embedding_dim)

vector_store = FAISS(
    embedding_function=embeddings,
    index=index,
    docstore=InMemoryDocstore(),
    index_to_docstore_id={},
)
```
```shell
pip install -qU langchain-milvus
```

```python
from langchain_milvus import Milvus

URI = "./milvus_example.db"

vector_store = Milvus(
    embedding_function=embeddings,
    connection_args={"uri": URI},
    index_params={"index_type": "FLAT", "metric_type": "L2"},
)
```
```shell
pip install -qU langchain-mongodb
```

```python
from langchain_mongodb import MongoDBAtlasVectorSearch

vector_store = MongoDBAtlasVectorSearch(
    embedding=embeddings,
    collection=MONGODB_COLLECTION,
    index_name=ATLAS_VECTOR_SEARCH_INDEX_NAME,
    relevance_score_fn="cosine",
)
```
```shell
pip install -qU langchain-postgres
```

```python
from langchain_postgres import PGVector

vector_store = PGVector(
    embeddings=embeddings,
    collection_name="my_docs",
    connection="postgresql+psycopg://...",
)
```
```shell
pip install -qU langchain-postgres
```

```python
from langchain_postgres import PGEngine, PGVectorStore

pg_engine = PGEngine.from_connection_string(
    url="postgresql+psycopg://..."
)

vector_store = PGVectorStore.create_sync(
    engine=pg_engine,
    table_name="test_table",
    embedding_service=embeddings,
)
```
```shell
pip install -qU langchain-pinecone
```

```python
from pinecone import Pinecone
from langchain_pinecone import PineconeVectorStore

pc = Pinecone(api_key=...)
index = pc.Index(index_name)

vector_store = PineconeVectorStore(embedding=embeddings, index=index)
```
```shell
pip install -qU langchain-qdrant
```

```python
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams
from langchain_qdrant import QdrantVectorStore

client = QdrantClient(":memory:")

vector_size = len(embeddings.embed_query("sample text"))

if not client.collection_exists("test"):
    client.create_collection(
        collection_name="test",
        vectors_config=VectorParams(size=vector_size, distance=Distance.COSINE),
    )

vector_store = QdrantVectorStore(
    client=client,
    collection_name="test",
    embedding=embeddings,
)
```
These integrations are compared on the following capabilities: Delete by ID, Filtering, Search by Vector, Search with score, Async, Passes Standard Tests, Multi Tenancy, and IDs in add Documents:
  • AstraDBVectorStore
  • Chroma
  • Clickhouse
  • CouchbaseSearchVectorStore
  • DatabricksVectorSearch
  • ElasticsearchStore
  • FAISS
  • InMemoryVectorStore
  • Milvus
  • Moorcheh
  • MongoDBAtlasVectorSearch
  • openGauss
  • PGVector
  • PGVectorStore
  • PineconeVectorStore
  • QdrantVectorStore
  • Weaviate
  • SQLServer
  • ZeusDB

All Vectorstores

Activeloop Deep Lake

Alibaba Cloud OpenSearch

AnalyticDB

Annoy

Apache Doris

ApertureDB

Astra DB Vector Store

Atlas

AwaDB

Azure Cosmos DB Mongo vCore

Azure Cosmos DB No SQL

Azure AI Search

Bagel

BagelDB

Baidu Cloud ElasticSearch VectorSearch

Baidu VectorDB

Apache Cassandra

Chroma

Clarifai

ClickHouse

Couchbase

DashVector

Databricks

IBM Db2

DingoDB

DocArray HnswSearch

DocArray InMemorySearch

Amazon Document DB

DuckDB

China Mobile ECloud ElasticSearch

Elasticsearch

Epsilla

Faiss

Faiss (Async)

FalkorDB

Gel

Google AlloyDB

Google BigQuery Vector Search

Google Cloud SQL for MySQL

Google Cloud SQL for PostgreSQL

Firestore

Google Memorystore for Redis

Google Spanner

Google Vertex AI Feature Store

Google Vertex AI Vector Search

Hippo

Hologres

Jaguar Vector Database

Kinetica

LanceDB

Lantern

Lindorm

LLMRails

ManticoreSearch

MariaDB

Marqo

Meilisearch

Amazon MemoryDB

Milvus

Momento Vector Index

Moorcheh

MongoDB Atlas

MyScale

Neo4j Vector Index

NucliaDB

Oceanbase

openGauss

OpenSearch

Oracle AI Vector Search

Pathway

Postgres Embedding

PGVecto.rs

PGVector

PGVectorStore

Pinecone

Pinecone (sparse)

Qdrant

Relyt

Rockset

SAP HANA Cloud Vector Engine

ScaNN

SemaDB

SingleStore

scikit-learn

SQLiteVec

SQLite-VSS

SQLServer

StarRocks

Supabase

SurrealDB

Tablestore

Tair

Tencent Cloud VectorDB

ThirdAI NeuralDB

TiDB Vector

Tigris

TileDB

Timescale Vector

Typesense

Upstash Vector

USearch

Vald

VDMS

Vearch

Vectara

Vespa

viking DB

vlite

Weaviate

Xata

YDB

Yellowbrick

Zep

Zep Cloud

ZeusDB

Zilliz