Want to use FAISS for local RAG? Okay, but where do you store the chunks (text and metadata)?
Solution: Pair FAISS with SQLite (or any other SQL database).
How: Keep the vectors in FAISS and the chunk data in SQLite, linked by a shared ID.
Benefits:
Use FAISS for vector data and SQLite for regular data, so each tool does what it was built for.
Often, you already have a database. You might only need to create an additional table or columns.
You get full-text search in most database engines, be it SQLite or Postgres.
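As an illustration of the full-text search point, here is a small sketch using SQLite's FTS5 extension through Python's built-in `sqlite3` module. It assumes your SQLite build ships with FTS5 (most do, but it is not guaranteed). The table name `chunks_fts`, the helper `keyword_search`, and the sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# An FTS5 virtual table indexes the chunk text for keyword search.
conn.execute("CREATE VIRTUAL TABLE chunks_fts USING fts5(text)")
conn.executemany(
    "INSERT INTO chunks_fts (rowid, text) VALUES (?, ?)",
    [
        (1, "FAISS stores embedding vectors."),
        (2, "SQLite stores chunk text and metadata."),
        (3, "Postgres also supports full text search."),
    ],
)
conn.commit()


def keyword_search(query, limit=5):
    """Return (rowid, text) pairs matching the query, best match first."""
    return conn.execute(
        "SELECT rowid, text FROM chunks_fts "
        "WHERE chunks_fts MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()
```

If the `rowid` values match the IDs used in FAISS, keyword hits and vector hits can be merged on the same ID.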
Of course, you can try pgvector, but FAISS comes with its own advantages. Ultimately, it depends on your use case.
Top comments (3)
nice writeup! but just a heads-up — pairing faiss vectors with sqlite like this can silently cause sync issues when chunks change but vector ids don’t update accordingly. happens more than you'd think. we've seen a bunch of hallucinations because of this mismatch.
might be worth adding a note about how to keep the ids + metadata in sync, especially if chunks are reprocessed later.
I get what you mean. I was suggesting a simple setup for local RAG. You can frame it as a transaction problem and try to solve it that way, but I would rather generate a hash of the content, convert it to an integer, and store it in both FAISS (as the ID) and SQLite (as content_id). Whenever the content changes, I update the content_id and update FAISS. I understand there might be rare cases where one write succeeds and the other fails, but then I can easily check that every content_id in SQLite has a matching FAISS ID. What do you think?
By the way, WFGY looks cool!
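The hash-based ID idea from the reply above could look roughly like the sketch below. It uses only the standard library. The choice of SHA-256, the 63-bit truncation, and the helper names `content_id` and `find_mismatches` are assumptions made for illustration, not something the comment specifies. The FAISS side is represented by a plain set of IDs, since how you read the stored IDs back depends on the index type you use.

```python
import hashlib
import sqlite3


def content_id(text):
    """Derive a stable integer ID from the chunk content."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    # Keep 63 bits so the value fits a signed 64-bit integer,
    # which is what both FAISS IDs and SQLite INTEGER columns hold.
    return int.from_bytes(digest[:8], "big") >> 1


def find_mismatches(conn, faiss_ids):
    """Compare the IDs in SQLite with the IDs present in FAISS.

    Returns (missing_in_faiss, orphaned_in_faiss) as two sets.
    """
    sqlite_ids = {row[0] for row in conn.execute("SELECT content_id FROM chunks")}
    faiss_ids = set(faiss_ids)
    return sqlite_ids - faiss_ids, faiss_ids - sqlite_ids


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chunks (content_id INTEGER PRIMARY KEY, text TEXT)")
for text in ["first chunk", "second chunk"]:
    conn.execute("INSERT INTO chunks VALUES (?, ?)", (content_id(text), text))
conn.commit()

# Simulate a FAISS index that missed one insert and still holds a stale ID.
stale_id = 12345
ids_in_faiss = {content_id("first chunk"), stale_id}

missing_in_faiss, orphaned_in_faiss = find_mismatches(conn, ids_in_faiss)
```

Here `missing_in_faiss` holds the ID of "second chunk" and `orphaned_in_faiss` holds the stale ID, so a periodic check like this can flag both kinds of drift. Because the ID is derived from the content, an edited chunk gets a new ID, and the old one has to be removed from both stores.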