Daniel Sogl

Posted on • Originally published at codingrules.ai

Generating and Storing Google Gemini Embeddings with Vercel AI SDK and Supabase

Text embeddings are numerical vectors that capture the meaning of text, enabling AI tasks such as semantic search, recommendations, and clustering. This post guides you through generating text embeddings using Google Gemini via the Vercel AI SDK and storing them in a Supabase Postgres database with the pgvector extension.

Why Vercel AI SDK and Supabase?

  • Vercel AI SDK: Offers a unified, developer-friendly API for interacting with AI providers like Google, simplifying embedding generation. Source: Vercel AI SDK Docs
  • Supabase & pgvector: Provides a scalable Postgres database with pgvector for efficient vector storage and similarity search within the database. Source: Supabase AI Docs

Prerequisites

  • A Node.js project (the examples assume Next.js with TypeScript)
  • A Supabase project with access to the SQL Editor
  • A Google AI (Gemini) API key

Step 1: Setup

Install the necessary packages:

pnpm add ai @ai-sdk/google @supabase/supabase-js
# or
npm install ai @ai-sdk/google @supabase/supabase-js
# or
yarn add ai @ai-sdk/google @supabase/supabase-js

Create a .env.local file for your environment variables:

# Required for Vercel AI SDK (Google)
# The @ai-sdk/google provider reads this variable by default
GOOGLE_GENERATIVE_AI_API_KEY='your-google-ai-api-key'

# Required for Supabase client
NEXT_PUBLIC_SUPABASE_URL='your-supabase-project-url'

# Public key for client-side reads (e.g., semantic search)
NEXT_PUBLIC_SUPABASE_ANON_KEY='your-supabase-anon-key'

# Service role key for server-side writes (e.g., storing embeddings)
SUPABASE_SERVICE_ROLE_KEY='your-supabase-service-role-key'

Warning:
Protect your API keys. Add .env.local to your .gitignore and use environment variables in your deployment environment.

Initialize the Supabase client (the later snippets import it from lib/supabaseClient.ts). Create separate clients for server-side operations (using the service role key) and client-side operations (using the anon key) if needed, or manage access appropriately based on your architecture.

import { createClient } from '@supabase/supabase-js';

const supabaseUrl = process.env.NEXT_PUBLIC_SUPABASE_URL!;
const supabaseAnonKey = process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY!;
const supabaseServiceRoleKey = process.env.SUPABASE_SERVICE_ROLE_KEY!;

// Client for public read access (e.g., in browser or Edge Functions)
export const supabase = createClient(supabaseUrl, supabaseAnonKey);

// Client for server-side operations requiring elevated privileges
// Use this cautiously and only in secure server environments
export const supabaseAdmin = createClient(supabaseUrl, supabaseServiceRoleKey);

Step 2: Prepare Supabase Database

  1. Enable vector extension: In your Supabase Dashboard, navigate to Database > Extensions, find vector, and enable it.
  2. Create table: Go to the SQL Editor and run the SQL below. Crucially, adjust VECTOR(768) if you plan to use a different Gemini model or optimize/truncate embeddings to a different dimension. text-embedding-004 uses 768 dimensions. Source: Google AI Embeddings
-- Ensure the vector extension is enabled
CREATE EXTENSION IF NOT EXISTS vector WITH SCHEMA extensions;

-- Create table to store documents and their embeddings
-- IMPORTANT: Adjust VECTOR dimensions based on your model and optimization strategy
CREATE TABLE documents (
  id BIGSERIAL PRIMARY KEY,
  content TEXT,          -- The original text content
  embedding VECTOR(768)  -- Use 768 for text-embedding-004 (default Gemini model)
                         -- Use a smaller value (e.g., 256) if truncating
);

-- Optional: Create an index for faster similarity search
-- Choose one index type based on your needs:

-- HNSW: Good balance of speed and recall (recommended for many use cases)
-- Adjust m and ef_construction based on dataset size and desired recall/speed trade-off
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);
-- WITH (m = 16, ef_construction = 64); -- Example parameters

-- IVFFlat: Faster builds, potentially lower recall than HNSW
-- Adjust 'lists' based on your dataset size (e.g., sqrt(N) where N is # rows)
-- CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
-- WITH (lists = 100); -- Example parameter
  3. Create Search Function: Still in the SQL Editor, create a stored procedure for efficient similarity search. Again, ensure VECTOR(768) matches the dimension used in your documents table.
-- Function to search for similar documents using cosine similarity
CREATE OR REPLACE FUNCTION match_documents (
  query_embedding VECTOR(768), -- Match the vector dimension in your table
  match_threshold FLOAT,       -- Similarity threshold (e.g., 0.7)
  match_count INT              -- Max number of results to return
)
RETURNS TABLE (
  id BIGINT,
  content TEXT,
  similarity FLOAT
)
LANGUAGE plpgsql
AS $$
BEGIN
  RETURN QUERY
  SELECT
    documents.id,
    documents.content,
    1 - (documents.embedding <=> query_embedding) AS similarity -- Cosine similarity
  FROM documents
  WHERE 1 - (documents.embedding <=> query_embedding) > match_threshold
  ORDER BY documents.embedding <=> query_embedding ASC -- Order by distance (closest first)
  LIMIT match_count;
END;
$$;
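
To sanity-check the function later (once the documents table has at least one embedded row), you can call it directly in the SQL Editor, reusing an existing row's embedding as the query vector:

-- Quick sanity check (assumes at least one row already has an embedding):
-- reuse an existing document's embedding as the query vector.
SELECT *
FROM match_documents(
  (SELECT embedding FROM documents WHERE embedding IS NOT NULL LIMIT 1),
  0.7, -- match_threshold
  5    -- match_count
);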

Step 3: Generate, Optimize (Optional), and Store Embeddings

Create functions to generate, optionally optimize, and store embeddings in Supabase. We'll put these in lib/embeddingUtils.ts.

1. Configuration and Imports

First, import necessary modules and define configuration constants for the embedding model and dimensions.

import { google } from '@ai-sdk/google';
import { embed } from 'ai';
import { supabaseAdmin } from './supabaseClient'; // Use admin client for inserts

// Choose your Google embedding model (e.g., text-embedding-004)
// Ref: https://ai.google.dev/docs/embeddings#available_models
const EMBEDDING_MODEL = google.textEmbeddingModel('text-embedding-004');

// Define the raw dimension of the chosen model
const RAW_VECTOR_DIMENSIONS = 768;

// Define the target dimension for optimization (if used). Must match DB schema!
const OPTIMIZED_VECTOR_DIMENSIONS = 256; // Example: optimize to 256 dimensions

2. Helper Function: L2 Normalization

This helper function normalizes a vector (scales it to have a length of 1), which is often beneficial for cosine similarity calculations.

// ... imports and config from above

function normalizeL2(v: number[]): number[] {
  const norm = Math.sqrt(v.reduce((sum, val) => sum + val * val, 0));
  if (norm === 0) return v; // Avoid division by zero
  return v.map((val) => val / norm);
}

3. Helper Function: Optimize Embedding

This function optionally truncates an embedding to a target dimension and then normalizes it using the normalizeL2 helper.

// ... imports, config, and normalizeL2 from above

/**
 * Truncates (optional) and normalizes an embedding vector.
 */
function optimizeEmbedding(
  embedding: number[],
  dimension: number = embedding.length, // Default: no truncation
): number[] {
  if (dimension < 0 || dimension > embedding.length) {
    console.warn(
      `Invalid target dimension ${dimension}. Using original length ${embedding.length}.`,
    );
    dimension = embedding.length;
  }

  // 1. Truncate if necessary
  const truncated =
    dimension === embedding.length ? embedding : embedding.slice(0, dimension);

  // 2. Normalize
  const normalized = normalizeL2(truncated);

  if (normalized.length !== dimension) {
    console.warn(
      `Optimization resulted in unexpected dimension: ${normalized.length}`,
    );
  }

  return normalized;
}

4. Core Function: Embed and Store

This main function orchestrates the process: it takes text, generates the embedding using the Vercel AI SDK, calls the optimization helper if requested, and inserts the original text and final embedding vector into the Supabase table using the admin client.

// ... imports, config, and helpers from above

export async function embedAndStore(
  text: string,
  optimize: boolean = false, // Set to true to truncate and normalize
  tableName: string = 'documents', // Make table name configurable
  contentColumn: string = 'content',
  embeddingColumn: string = 'embedding',
): Promise<{ success: boolean; error?: Error }> {
  const cleanedText = text; // Google models handle newlines generally well

  try {
    console.log(
      `Generating Gemini embedding for: "${cleanedText.substring(0, 60)}..."`,
    );

    // Generate the initial embedding
    const { embedding: rawEmbedding } = await embed({
      model: EMBEDDING_MODEL,
      value: cleanedText,
      // Optional: Use Google's built-in dimensionality reduction
      // ...(optimize ? { parameters: { outputDimensionality: OPTIMIZED_VECTOR_DIMENSIONS } } : {}),
    });

    // Basic validation of raw embedding length
    if (
      rawEmbedding.length !== RAW_VECTOR_DIMENSIONS &&
      !optimize /* Add check if using Google's param */
    ) {
      console.warn(
        `Expected ${RAW_VECTOR_DIMENSIONS} dimensions, got ${rawEmbedding.length}`,
      );
    }

    // Optimize (truncate/normalize) or just normalize
    const finalEmbedding = optimize
      ? optimizeEmbedding(rawEmbedding, OPTIMIZED_VECTOR_DIMENSIONS)
      : optimizeEmbedding(rawEmbedding); // Normalize only if not truncating

    // Validate final embedding length
    const targetDimension = optimize
      ? OPTIMIZED_VECTOR_DIMENSIONS
      : RAW_VECTOR_DIMENSIONS;
    if (finalEmbedding.length !== targetDimension) {
      throw new Error(
        `Final embedding dimension mismatch: expected ${targetDimension}, got ${finalEmbedding.length}`,
      );
    }

    // Store in Supabase
    console.log(`Storing embedding with ${finalEmbedding.length} dimensions.`);
    const { error } = await supabaseAdmin.from(tableName).insert({
      [contentColumn]: cleanedText,
      [embeddingColumn]: finalEmbedding,
    });

    if (error) {
      console.error('Supabase insert error:', error);
      throw new Error(
        `Failed to store embedding in Supabase: ${error.message}`,
      );
    }

    console.log('Successfully stored text and embedding.');
    return { success: true };
  } catch (error: any) {
    console.error('Error in embedAndStore process:', error);
    return { success: false, error: error as Error };
  }
}
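
Here is a rough sketch of how you might call embedAndStore from a server-side entry point. It assumes a Next.js App Router project with an @/lib import alias; the route path and request shape are illustrative, not part of the setup above:

// app/api/embed/route.ts (hypothetical path - adjust to your project)
import { NextResponse } from 'next/server';
import { embedAndStore } from '@/lib/embeddingUtils';

export async function POST(request: Request) {
  const { text } = await request.json();

  if (!text || typeof text !== 'string') {
    return NextResponse.json({ error: 'Missing "text" field' }, { status: 400 });
  }

  // Keep optimize=false unless your table and search function use the reduced dimension (e.g., VECTOR(256))
  const { success, error } = await embedAndStore(text, false);

  if (!success) {
    return NextResponse.json({ error: error?.message }, { status: 500 });
  }

  return NextResponse.json({ success: true });
}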

Optimization Considerations:
If you enable optimization (truncation + normalization), ensure the VECTOR(...) dimension in your documents table and match_documents function exactly matches OPTIMIZED_VECTOR_DIMENSIONS (e.g., 256). If optimization is disabled, ensure they match RAW_VECTOR_DIMENSIONS (768 for text-embedding-004). Consistency is key! Normalization (normalizeL2) is generally beneficial for cosine similarity searches even without truncation. Google's models also support an outputDimensionality parameter in the API call for built-in reduction. Source: Vercel AI SDK - Google Provider
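
If you prefer to let Google reduce the dimensionality instead of truncating locally, a sketch of that setup might look like the snippet below. The exact location of the option depends on your @ai-sdk/google version (newer AI SDK releases expect it under providerOptions on the embed() call), so check the provider docs for the version you have installed:

// Sketch: ask Google for a reduced-dimension embedding instead of truncating locally.
// The settings object below assumes an older @ai-sdk/google release; newer AI SDK
// versions expect providerOptions: { google: { outputDimensionality } } on embed().
import { google } from '@ai-sdk/google';
import { embed } from 'ai';

const reducedModel = google.textEmbeddingModel('text-embedding-004', {
  outputDimensionality: 256, // must match VECTOR(256) in your table and search function
});

const { embedding } = await embed({
  model: reducedModel,
  value: 'Some text to embed',
});
// embedding.length should now be 256; you may still want to L2-normalize it.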

Step 4: Implement Semantic Search

Create a function in lib/semanticSearch.ts to perform semantic search using the stored embeddings and the Supabase RPC function (match_documents).

1. Configuration and Imports

Import dependencies and define configuration matching the embedding generation step.

import { google } from '@ai-sdk/google';
import { embed } from 'ai';
import { supabase } from './supabaseClient'; // Use public client for reads

// Match configuration with embedAndStore and DB schema
const EMBEDDING_MODEL = google.textEmbeddingModel('text-embedding-004');
const RAW_VECTOR_DIMENSIONS = 768;
const OPTIMIZED_VECTOR_DIMENSIONS = 256; // Must match embedAndStore if optimize=true

2. Re-use Helper Functions

Include the same normalizeL2 and optimizeEmbedding helper functions used during storage to ensure the query vector is processed identically.

// ... imports and config from above

// --- Re-use Helper: Normalize L2 ---
function normalizeL2(v: number[]): number[] {
  const norm = Math.sqrt(v.reduce((sum, val) => sum + val * val, 0));
  if (norm === 0) return v;
  return v.map((val) => val / norm);
}

// --- Re-use Helper: Optimize Embedding ---
function optimizeEmbedding(
  embedding: number[],
  dimension: number = embedding.length,
): number[] {
  // (Identical implementation as in embeddingUtils.ts)
  if (dimension < 0 || dimension > embedding.length) {
    dimension = embedding.length;
  }
  const truncated =
    dimension === embedding.length ? embedding : embedding.slice(0, dimension);
  const normalized = normalizeL2(truncated);
  if (normalized.length !== dimension) {
    console.warn(
      `Search query optimization resulted in unexpected dimension: ${normalized.length}`,
    );
  }
  return normalized;
}

3. Core Search Function

This function takes a search query, generates its embedding, optimizes it (if the stored embeddings were optimized), and then calls the match_documents Supabase RPC function to find similar documents.

// ... imports, config, and helpers from above

export async function semanticSearch(
  query: string,
  optimize: boolean = false, // MUST match the optimization state used for storing
  limit: number = 5,
  threshold: number = 0.7, // Similarity threshold
): Promise<{
  results: Array<{ content: string; similarity: number }>;
  error?: Error;
}> {
  try {
    // 1. Generate Query Embedding
    const { embedding: rawQueryEmbedding } = await embed({
      model: EMBEDDING_MODEL,
      value: query,
      // Optional: Use Google's built-in dimensionality reduction
      // ...(optimize ? { parameters: { outputDimensionality: OPTIMIZED_VECTOR_DIMENSIONS } } : {}),
    });

    // 2. Optimize Query Embedding (must match storage optimization strategy)
    const queryEmbedding = optimize
      ? optimizeEmbedding(rawQueryEmbedding, OPTIMIZED_VECTOR_DIMENSIONS)
      : optimizeEmbedding(rawQueryEmbedding); // Normalize only

    // Validate query embedding length
    const expectedDimension = optimize
      ? OPTIMIZED_VECTOR_DIMENSIONS
      : RAW_VECTOR_DIMENSIONS;
    if (queryEmbedding.length !== expectedDimension) {
      throw new Error(
        `Query embedding dimension mismatch: expected ${expectedDimension}, got ${queryEmbedding.length}`,
      );
    }

    // 3. Call Supabase RPC function
    const { data, error } = await supabase.rpc('match_documents', {
      query_embedding: queryEmbedding,
      match_threshold: threshold,
      match_count: limit,
    });

    if (error) {
      console.error('Supabase RPC error:', error);
      throw new Error(`Semantic search failed: ${error.message}`);
    }

    return { results: data || [] };
  } catch (error: any) {
    console.error('Error in semanticSearch process:', error);
    return { results: [], error: error as Error };
  }
}
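
As with storage, here is a rough sketch of how a server-side route might expose semanticSearch. The route path and @/lib alias are assumptions; adjust them to your project layout:

// app/api/search/route.ts (hypothetical path - adjust to your project)
import { NextResponse } from 'next/server';
import { semanticSearch } from '@/lib/semanticSearch';

export async function GET(request: Request) {
  const query = new URL(request.url).searchParams.get('q');

  if (!query) {
    return NextResponse.json({ error: 'Missing "q" query parameter' }, { status: 400 });
  }

  // The optimize flag MUST match the value used when the documents were stored
  const { results, error } = await semanticSearch(query, false, 5, 0.7);

  if (error) {
    return NextResponse.json({ error: error.message }, { status: 500 });
  }

  return NextResponse.json({ results });
}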

Alternative Approach: Automatic Database-Driven Embeddings

While the application-level approach works well for many use cases, we can also implement a more automated pattern where the database itself manages the embedding generation process.

How Database-Driven Embeddings Work

Instead of managing embeddings at the application level (as shown in the examples above), we can use PostgreSQL's trigger system to automatically generate and update embeddings when data changes:

  1. When a row is inserted or updated, a database trigger fires
  2. The trigger enqueues an embedding job via pgmq (PostgreSQL Message Queue)
  3. A background worker (pg_cron) processes the queue
  4. The worker calls an Edge Function via pg_net to generate the embedding
  5. The embedding is stored back in the vector column
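
As a rough sketch of the plumbing this flow relies on, the required extensions and a job queue can be set up in SQL before creating the table and triggers (the queue name embedding_jobs is an assumption; use whatever name your worker expects):

-- Enable the building blocks of the async pipeline (run once per project)
CREATE EXTENSION IF NOT EXISTS pgmq;     -- message queue for embedding jobs
CREATE EXTENSION IF NOT EXISTS pg_cron;  -- scheduled background worker
CREATE EXTENSION IF NOT EXISTS pg_net;   -- HTTP calls to the Edge Function

-- Create the queue the triggers will write to (queue name is an assumption)
SELECT pgmq.create('embedding_jobs');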

This pattern has several advantages:

  • No embedding drift: Your vectors automatically stay in sync with source content
  • Fully asynchronous: Write operations aren't slowed down by embedding generation
  • Resilient: Jobs are retried if they fail, ensuring embedding consistency
  • SQL-native: The entire process is managed within PostgreSQL

Implementation Example

Here's a simplified example of how to set up database-driven automatic embeddings:

-- Create a table with a vector column
CREATE TABLE documents (
  id integer PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
  title text NOT NULL,
  content text NOT NULL,
  embedding vector(768), -- For Google's text-embedding-004
  created_at timestamp with time zone DEFAULT now()
);

-- Create a function that specifies what content to embed
CREATE OR REPLACE FUNCTION embedding_input(doc documents)
RETURNS text
LANGUAGE plpgsql
IMMUTABLE
AS $$
BEGIN
  RETURN '# ' || doc.title || E'\n\n' || doc.content;
END;
$$;

-- Add triggers to handle inserts and updates
CREATE TRIGGER embed_documents_on_insert
  AFTER INSERT ON documents
  FOR EACH ROW
  EXECUTE FUNCTION util.queue_embeddings('embedding_input', 'embedding');

CREATE TRIGGER embed_documents_on_update
  AFTER UPDATE OF title, content ON documents
  FOR EACH ROW
  EXECUTE FUNCTION util.queue_embeddings('embedding_input', 'embedding');
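
The triggers above call util.queue_embeddings, which isn't defined in this post. As a loose sketch of the shape such a helper could take (the full version in Supabase's automatic-embeddings guide also handles the pg_cron worker, retries, and the Edge Function call), it simply enqueues a pgmq message describing which row and column to embed:

-- Hypothetical minimal shape of the trigger helper (not the full Supabase version)
CREATE SCHEMA IF NOT EXISTS util;

CREATE OR REPLACE FUNCTION util.queue_embeddings()
RETURNS trigger
LANGUAGE plpgsql
AS $$
BEGIN
  -- TG_ARGV[0]: name of the function that builds the text to embed
  -- TG_ARGV[1]: name of the vector column to populate
  PERFORM pgmq.send(
    'embedding_jobs', -- queue name is an assumption; match the queue you created
    jsonb_build_object(
      'id', NEW.id,
      'schema', TG_TABLE_SCHEMA,
      'table', TG_TABLE_NAME,
      'contentFunction', TG_ARGV[0],
      'embeddingColumn', TG_ARGV[1]
    )
  );
  RETURN NEW;
END;
$$;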

This approach removes the need for the application-level embedding logic we implemented earlier. When you insert or update a document, the embedding is automatically generated and stored:

-- Insert a document (embedding will be generated asynchronously)
INSERT INTO documents (title, content)
VALUES (
  'Understanding Embeddings',
  'Embeddings are vector representations of text...'
);

Initially, the embedding column will be null, but within a few seconds, it will be automatically populated by the background process.
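
You can check on the background work from the SQL Editor; this query is just a convenience for verification, based on the table defined above:

-- Check whether the background worker has filled in the embeddings yet
SELECT id, title, (embedding IS NOT NULL) AS has_embedding
FROM documents
ORDER BY created_at DESC
LIMIT 10;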

This pattern is particularly useful for production systems where you need to ensure embeddings always stay in sync with your content without building complex application logic.

The implementation requires setting up Supabase Edge Functions to handle the actual embedding generation and PostgreSQL extensions like pgmq, pg_cron, and pg_net to manage the asynchronous workload.

Conclusion

You've now learned how to generate Google Gemini embeddings using the Vercel AI SDK, optionally optimize them, store them efficiently in Supabase with pgvector, and perform semantic searches. This powerful combination enables sophisticated AI features like:

  • Meaning-based Search: Go beyond keywords to find truly relevant content.
  • Recommendation Systems: Suggest similar items based on semantic understanding.
  • RAG (Retrieval-Augmented Generation): Find relevant context to enhance LLM responses for Q&A bots.

The Vercel AI SDK simplifies AI model interactions, while Supabase and pgvector provide a robust, scalable backend. Remember to maintain consistency in embedding dimensions between storage and search, especially when using optimization techniques.

Happy building!
