
Byron Salty

Easy LLM based text classification with instructor_ex

I got lucky today.

Just as I was contemplating how to add grammar and spell checking to my Text-to-Image game, I read about a new library that I could use to solve both problems at the same time.

In my game, players try to find the 5 missing words that complete the prompt phrase used to generate the original image. Every time there is a unique guess, I generate a new image to show the user how close they were. But if a user guesses a phrase that someone else has already guessed, I can pull the image from cache and don't need to pay for a new generation.

This means that spelling errors cost me money.

And since the target phrase is grammatically correct, any bad grammar is also a waste.

I was considering adding a dictionary to check words against, and I wasn't sure what to do about grammar checking...


Enter instructor_ex and LLMs

I'm a bit ashamed to admit that I didn't first jump to using an LLM to solve this problem - because the answer is pretty obvious now.

Why not just ask an LLM, which understands both grammar and spelling VERY well:
"Is the following phrase grammatically correct and spelled correctly?"

This is the exact type of question and use case that Thomas Millar's new library instructor_ex is designed to make easy.


Step 0 - Add Instructor to your project

Update your mix.exs deps

```elixir
defp deps do
  [
    ...
    {:instructor, "~> 0.0.2"},
    ...
  ]
end
```

Add instructor and openai to your config:

```elixir
config :instructor, adapter: Instructor.Adapters.OpenAI

config :openai,
  api_key: "sk-...8pj",
  http_options: [recv_timeout: 10 * 60 * 1000]
```

Note: you can tell from this config that OpenAI is not the only adapter (or LLM) you can use.
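As a sketch of what swapping backends would look like (the adapter module name below is an assumption on my part — check the instructor_ex docs for the adapters your version actually ships), pointing Instructor at a local model should just mean changing the adapter config:

```elixir
# Assumption: instructor_ex provides a llama.cpp adapter under this module name;
# verify against the library's documentation for your version before using it.
config :instructor, adapter: Instructor.Adapters.Llamacpp
```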


Step 1 - Create a Grammar classifier

Instructor follows the Ecto schema pattern, which I believe could be extremely helpful for more complex use cases. In my use case, the effort was minimal and the definition was pretty clear.

The one part that was a bit confusing was that the question being asked of the LLM is embedded in the @doc string for the schema. Here's my classifier:

```elixir
defmodule GrammarClassification do
  use Ecto.Schema

  @doc """
  A classification of whether or not a provided phrase is
  grammatically correct and has correct spelling
  """
  @primary_key false
  embedded_schema do
    field(:is_correct?, :boolean)
  end
end
```

Step 2 - Classify some text and get a result

I then wrapped the call to the classifier in a simple helper function, following the exact format from the example notebooks found in the project:

```elixir
defmodule Teleprompt.TextHelper do
  def is_correct?(phrase) do
    {:ok, %{is_correct?: result}} =
      Instructor.chat_completion(
        model: "gpt-3.5-turbo",
        response_model: GrammarClassification,
        messages: [
          %{
            role: "user",
            content: "Classify the following text: #{phrase}"
          }
        ]
      )

    result
  end
end
```

Now I can simply call a single function, pass in a string of words, and receive a true/false response on whether or not it is spelled correctly and grammatically correct.
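For example, in an iex session it looks like this (the results come from the LLM, so treat the expected outcomes in the comments as illustrative, not guaranteed):

```elixir
# Illustrative calls; actual return values depend on the model's judgment.
Teleprompt.TextHelper.is_correct?("The quick brown fox jumps over the lazy dog.")
# a well-formed sentence, so this should come back true

Teleprompt.TextHelper.is_correct?("Teh quick brown fox jump over lazy dog")
# misspellings and broken grammar, so expect false
```

In the game, I can now run a guess through this check before paying for an image generation.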

Hell yeah!

Top comments (1)

aaronblondeau

I would've never known about instructor_ex (or Instructor or instructor-js). Haven't been totally happy with LangChain's documentation and syntax so going to give those a look - thanks!