- Supports CommonJS and ESM (see the CommonJS sketch after this list).
- Uses @anush008/tokenizers multi-arch native bindings for @huggingface/tokenizers.
- Supports batch embeddings with generators.
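The package can be loaded from either module system. Below is a minimal CommonJS sketch; since `require` contexts can't use top-level `await`, the `init` call is wrapped in an async function. (The ESM form is shown in the usage example further down.)

```js
// CommonJS loading; ESM users can `import { EmbeddingModel, FlagEmbedding } from "fastembed"` instead.
const { EmbeddingModel, FlagEmbedding } = require("fastembed");

async function main() {
  const embeddingModel = await FlagEmbedding.init({ model: EmbeddingModel.BGEBaseEN });

  // embed() returns an async generator that yields batches of embeddings (number[][])
  for await (const batch of embeddingModel.embed(["passage: Hello, World!"])) {
    console.log(batch[0]);
  }
}

main();
```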
The default model is Flag Embedding, which is top of the MTEB leaderboard.
FastEmbed is also available in other languages:

- Python 🐍: fastembed
- Rust 🦀: fastembed-rs
- Go 🐳: fastembed-go
Supported models:

- BAAI/bge-base-en
- BAAI/bge-base-en-v1.5
- BAAI/bge-small-en
- BAAI/bge-small-en-v1.5 - Default (see the model selection sketch after this list)
- BAAI/bge-base-zh-v1.5
- sentence-transformers/all-MiniLM-L6-v2
- intfloat/multilingual-e5-large
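Any of these can be passed to `FlagEmbedding.init`. A minimal sketch; the exact `EmbeddingModel` member names used here (e.g. `BGESmallENV15`) are assumptions inferred from the model names above, so check the exported enum for the real identifiers:

```js
import { EmbeddingModel, FlagEmbedding } from "fastembed";

// Assumed enum member for BAAI/bge-small-en-v1.5 (the default model);
// verify against the exported EmbeddingModel enum.
const smallModel = await FlagEmbedding.init({
  model: EmbeddingModel.BGESmallENV15,
});

for await (const batch of smallModel.embed(["passage: Model selection example."])) {
  console.log(batch[0].length); // embedding dimensionality (384 for bge-small-en-v1.5)
}
```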
To install the FastEmbed library, npm works:
```bash
npm install fastembed
```

To generate embeddings:

```js
import { EmbeddingModel, FlagEmbedding } from "fastembed";
// For CommonJS
// const { EmbeddingModel, FlagEmbedding } = require("fastembed")

const embeddingModel = await FlagEmbedding.init({
  model: EmbeddingModel.BGEBaseEN,
});

let documents = [
  "passage: Hello, World!",
  "query: Hello, World!",
  "passage: This is an example passage.",
  // You can leave out the prefix but it's recommended
  "fastembed-js is licensed under MIT",
];

const embeddings = embeddingModel.embed(documents, 2); // Optional batch size. Defaults to 256

for await (const batch of embeddings) {
  // batch is a list of Float32 embeddings (number[][]) with length 2
  console.log(batch);
}
```

Passage and query embeddings are also supported, for more accurate retrieval:

```js
const embeddings = embeddingModel.passageEmbed(listOfLongTexts, 10); // Optional batch size. Defaults to 256

for await (const batch of embeddings) {
  // batch is a list of Float32 passage embeddings (number[][]) with length 10
  console.log(batch);
}

const queryEmbeddings: number[] = await embeddingModel.queryEmbed(userQuery);
console.log(queryEmbeddings);
```
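Query and passage embeddings from the same model share a vector space, so retrieval reduces to a similarity search. A minimal sketch, reusing `embeddingModel`, `listOfLongTexts`, and `queryEmbeddings` from the example above; the cosine similarity helper is plain code, not part of the fastembed API:

```js
// Collect all passage vectors from the async generator into one array
const passageVectors: number[][] = [];
for await (const batch of embeddingModel.passageEmbed(listOfLongTexts)) {
  passageVectors.push(...batch);
}

// Plain cosine similarity between two equal-length vectors
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank passages by similarity to the query embedding from queryEmbed()
const ranked = passageVectors
  .map((vector, i) => ({ index: i, score: cosineSimilarity(queryEmbeddings, vector) }))
  .sort((a, b) => b.score - a.score);
console.log(ranked[0]); // best-matching passage index and its score
```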
It's important we justify the "fast" in FastEmbed. FastEmbed is fast because of:

- Quantized model weights
- ONNX Runtime, which allows for inference on CPU, GPU, and other dedicated runtimes

It is also lightweight and accurate:

- No hidden dependencies via Huggingface Transformers
- Better than OpenAI Ada-002
- Top of the embedding leaderboards, e.g. MTEB
MIT © 2023