
Embeddings.js — Simple Text Embeddings library for Node.js




Embeddings.js is a simple way to get text embeddings in Node.js. Embeddings are useful for text similarity search using a vector database.

await embeddings("Hello World!"); // embedding array

Install

npm install @themaximalist/embeddings.js

To use local embeddings, also install the Transformers.js package, which runs the local model:

npm install @xenova/transformers

Configure

Embeddings.js works out of the box with local embeddings, but if you use the OpenAI or Mistral embeddings you’ll need an API key in your environment.

export OPENAI_API_KEY=<your-openai-api-key>
export MISTRAL_API_KEY=<your-mistral-api-key>

Usage

Using Embeddings.js is as simple as calling a function with any string.

import embeddings from "@themaximalist/embeddings.js";

// defaults to local embeddings
const embedding = await embeddings("Hello World!");
// 384 dimension embedding array

Switching embedding models is easy:

// openai
const openaiEmbedding = await embeddings("Hello World", {
  service: "openai"
});
// 1536 dimension embedding array

// mistral
const mistralEmbedding = await embeddings("Hello World", {
  service: "mistral"
});
// 1024 dimension embedding array
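If the service should depend on which API key is configured, a small helper can build the options object. This is only a sketch: `pickEmbeddingOptions` is a hypothetical name, not part of Embeddings.js, and it assumes the environment variable names `OPENAI_API_KEY` and `MISTRAL_API_KEY` and that an empty options object selects the default local embeddings.

```javascript
// Hypothetical helper (not part of Embeddings.js): choose the embedding
// service from whichever API key is present in the given environment object.
function pickEmbeddingOptions(env) {
  if (env.OPENAI_API_KEY) return { service: "openai" };
  if (env.MISTRAL_API_KEY) return { service: "mistral" };
  return {}; // no key found: rely on the default local embeddings
}

// Usage (inside an ES module or async function):
//   const embedding = await embeddings("Hello World", pickEmbeddingOptions(process.env));
```

Keep in mind that switching services changes the embedding dimensions, so embeddings from different services should not be mixed in one index.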

Cache

Embeddings.js caches by default, but you can disable it by passing cache: false as an option.

// don't cache (on by default)
const embedding = await embeddings("Hello World!", {
  cache: false
});

The cache file is written to .embeddings.cache.json—you can also delete this file to reset the cache.
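To illustrate what caching buys you, here is a minimal in-memory sketch. It is not how Embeddings.js implements its file-backed cache; `withMemoryCache` is a hypothetical wrapper that could be useful if you pass cache: false and want to manage caching yourself.

```javascript
// Hypothetical wrapper (not the library's implementation): memoize any async
// embedding function in memory, keyed by the input text and options.
function withMemoryCache(embedFn) {
  const cache = new Map();
  return function cachedEmbed(input, options = {}) {
    const key = JSON.stringify([input, options]);
    if (!cache.has(key)) {
      // Store the promise itself so concurrent calls share one request.
      const promise = Promise.resolve(embedFn(input, options));
      cache.set(key, promise);
      // Forget failed requests so a later call can retry.
      promise.catch(() => cache.delete(key));
    }
    return cache.get(key);
  };
}

// Usage (assumes `embeddings` is imported from "@themaximalist/embeddings.js"):
//   const cachedEmbeddings = withMemoryCache((text, opts) =>
//     embeddings(text, { ...opts, cache: false }));
```

The key includes the options because the same text yields different embeddings under different services or models.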

API

The Embeddings.js API is a simple function you call with your text, with an optional config object.

await embeddings(
  input, // Text input to compute embeddings
  {
    service: "openai", // Embedding service
    model: "text-embedding-ada-002", // Embedding model
    cache: true, // Cache embeddings
    cache_file: ".embeddings.cache.json", // Cache file
  }
);

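Options

service: Embedding service to use. Pass "openai" or "mistral" for a hosted service; local embeddings are used when it is omitted.
model: Embedding model to use with the selected service, for example "text-embedding-ada-002" with OpenAI.
cache: Whether to cache embeddings. Enabled by default.
cache_file: Path of the cache file. Defaults to .embeddings.cache.json.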

Response

Embeddings.js returns a float[] — an array of floating-point numbers.

[ -0.011776604689657688, 0.024298833683133125, 0.0012317118234932423, ... ]

The length of the array is the dimensions of the embedding. When performing text similarity, you’ll want to know the dimensions of your embeddings to use them in a vector database.

Dimensions by embedding service, as shown in the usage examples:

Local (default): 384
OpenAI: 1536
Mistral: 1024

The Embeddings.js API ensures you have a simple way to use embeddings from multiple providers.
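Because each service produces a different array length, it can help to check an embedding before writing it to an index. The sketch below uses the dimensions quoted in this document; `assertDimensions`, `EXPECTED_DIMENSIONS`, and the "local" label are hypothetical names, and a non-default model may produce a different length.

```javascript
// Dimensions taken from the usage examples in this document.
// The "local" key is an illustrative label for the default local embeddings.
const EXPECTED_DIMENSIONS = { local: 384, openai: 1536, mistral: 1024 };

// Hypothetical helper (not part of Embeddings.js): verify that an embedding
// has the length the target vector index was created with.
function assertDimensions(embedding, service) {
  const expected = EXPECTED_DIMENSIONS[service];
  if (expected === undefined) {
    throw new Error(`Unknown service: ${service}`);
  }
  if (!Array.isArray(embedding) || embedding.length !== expected) {
    throw new Error(
      `Expected ${expected} dimensions for ${service}, got ${embedding?.length}`
    );
  }
  return embedding; // returned unchanged so the call can be chained
}
```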

Debug

Embeddings.js uses the debug npm module with the embeddings.js namespace.

View debug logs by setting the DEBUG environment variable.

DEBUG="embeddings.js*" node src/get_embeddings.js
# debug logs

Vector Database

Embeddings can be used with any vector database, such as Pinecone, Chroma, or pgvector.

For a local vector database that runs in-memory and uses Embeddings.js internally, check out VectorDB.js.
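To make the idea of similarity search concrete, here is a minimal sketch of what a vector database does at its core: rank stored embeddings by cosine similarity to a query embedding. The function names are hypothetical, this is not how VectorDB.js or any hosted database is implemented, and toy 3-dimension vectors stand in for real embeddings.

```javascript
// Cosine similarity between two equal-length embedding arrays.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Embeddings must have the same dimensions");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical in-memory search: rank stored items by similarity to a query.
// `items` is an array of { text, embedding } objects.
function search(items, queryEmbedding, topK = 3) {
  return items
    .map((item) => ({
      text: item.text,
      score: cosineSimilarity(item.embedding, queryEmbedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Toy vectors; in practice each embedding would come from `await embeddings(text)`.
const items = [
  { text: "red", embedding: [1, 0, 0] },
  { text: "green", embedding: [0, 1, 0] },
  { text: "orange", embedding: [0.9, 0.4, 0] },
];
const results = search(items, [1, 0.1, 0], 2);
console.log(results.map((r) => r.text)); // [ 'red', 'orange' ]
```

A linear scan like this is fine for small collections; dedicated vector databases add indexing so that search stays fast as the collection grows.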

Projects

Embeddings.js is currently used in the following projects:

License

MIT

Author

Created by The Maximalist. See our open-source projects.