FEATURE: Semantic Search #33
Merged
Summary
`chatbot:refresh_embeddings` to create embeddings
Required changes to app.yml
These changes will require careful uninstalling if you wish to remove the bot. See the main README for removal instructions.
This update brings forum search, which requires embeddings. Parts of the change are breaking, so listen up!
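For readers new to embeddings, here is a minimal sketch of the idea behind embedding-based search: each post is represented by a vector, and a query is answered by ranking posts by vector similarity. This is not the plugin's implementation, which delegates the search to a Postgres extension. The function names, the toy 3-dimensional vectors and the choice of cosine similarity are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def nearest_posts(query_embedding, post_embeddings, limit=3):
    """Return the ids of the `limit` posts most similar to the query."""
    ranked = sorted(
        post_embeddings.items(),
        key=lambda item: cosine_similarity(query_embedding, item[1]),
        reverse=True,
    )
    return [post_id for post_id, _ in ranked[:limit]]


# Hypothetical data: toy vectors standing in for real post embeddings.
post_embeddings = {
    101: [0.9, 0.1, 0.0],
    102: [0.0, 1.0, 0.0],
    103: [0.7, 0.3, 0.1],
}
query = [1.0, 0.0, 0.0]
top = nearest_posts(query, post_embeddings, limit=2)
print(top)
```

A brute-force scan like this compares the query against every post, which is why a dedicated vector index in the database matters once a forum has many posts.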
I use the Postgres extension known as `pg_embeddings`. This promises vector searches 20x the speed of `pgvector` but requires a bespoke build.
Now needs the following added to `app.yml` in the `after_code:` section before the plugins are cloned.
(NB you may be able to omit the first three commands if your server can see the `postgresql-server-dev-x` package)
This is necessary to add the `pg_embeddings` extension.
Creating the Embeddings
Once built, we need to create the embeddings for all posts, so the bot can find forum information.
Enter the container with `./launcher enter app` and run the following rake command: `rake chatbot:refresh_embeddings[1]`
At present this runs twice for an unknown reason (sorry! feel free to PR), but the `[1]` ensures the second run only adds missing embeddings (i.e. none immediately after the first run).
Compared to bot interactions, embeddings are not expensive to create, but do watch your usage on your OpenAI dashboard in any case.
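To show why a "missing only" run is harmless to repeat, here is a rough sketch of that behaviour. It is not the plugin's rake task: `refresh_embeddings`, `fake_embed` and the in-memory `stored` dictionary are hypothetical stand-ins for the real task, the embedding API call and the database table.

```python
def refresh_embeddings(posts, stored, embed, missing_only=False):
    """Create embeddings for posts; with missing_only, skip posts already stored.

    posts:  dict of post id -> text
    stored: dict of post id -> embedding, updated in place
    embed:  callable turning text into a vector (stand-in for an API call)
    Returns the number of embeddings created.
    """
    created = 0
    for post_id, text in posts.items():
        if missing_only and post_id in stored:
            continue  # already embedded, so no API call and no cost
        stored[post_id] = embed(text)
        created += 1
    return created


def fake_embed(text):
    # Hypothetical stand-in for a real embedding request.
    return [float(len(text))]


posts = {1: "hello", 2: "semantic search"}
stored = {}
first_run = refresh_embeddings(posts, stored, fake_embed, missing_only=True)
second_run = refresh_embeddings(posts, stored, fake_embed, missing_only=True)
print(first_run, second_run)
```

In this sketch the second pass creates nothing because every post already has an embedding, which mirrors the note above about the repeated run adding none.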
NB Embeddings are only created for Posts, and only for those Posts a Trust Level One user could access. This seemed like a reasonable compromise. Embeddings are not created for posts in content that is only accessible to Trust Level 2+ users.
Model considerations
In order to use the bot in agent mode you must select one of the `0613` variants for the setting `chatbot_open_ai_model`, otherwise the agent will not function correctly.
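If you script your site configuration, a sanity check along these lines could catch a wrong model choice early. This is only a sketch: `supports_agent_mode` is a hypothetical helper, not something the plugin provides, and it assumes that a `0613` variant can be recognised by that substring in the model name.

```python
def supports_agent_mode(model_name):
    """True if the model name looks like one of the 0613 variants."""
    return "0613" in model_name


# Example values one might set for chatbot_open_ai_model.
for name in ("gpt-3.5-turbo-0613", "gpt-4-0613", "gpt-3.5-turbo"):
    status = "ok for agent mode" if supports_agent_mode(name) else "not a 0613 variant"
    print(f"{name}: {status}")
```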