The AI.CLASSIFY function
This document describes the AI.CLASSIFY function, which uses a Vertex AI Gemini model to classify inputs into categories that you provide. BigQuery automatically structures your input to improve the quality of the classification.
The following are common use cases:
- Retail: Classify reviews by sentiment or classify products by categories.
- Text analysis: Classify support tickets or emails by topic.
Input
AI.CLASSIFY accepts the following types of input:
- Text data from standard tables.
This function passes your input to a Gemini model and incurs charges in Vertex AI each time it's called.
Syntax
AI.CLASSIFY(
  [ input => ] 'INPUT',
  [ categories => ] 'CATEGORIES',
  connection_id => 'CONNECTION'
)
Arguments
AI.CLASSIFY takes the following arguments:
- INPUT: a STRING or STRUCT value that specifies the input to classify. The input must be the first argument that you specify. You can provide the input value in the following ways:
  - Specify a STRING value. For example, 'apple'.
  - Specify a STRUCT value that contains one or more fields. You can use the following types of fields within the STRUCT value:

    | Field type | Description | Examples |
    |---|---|---|
    | STRING | A string literal, or the name of a STRING column. | String literal: 'apple'. String column name: my_string_column |
    | ARRAY<STRING> | You can only use string literals in the array. | Array of string literals: ['red ', 'apples'] |

    The function combines STRUCT fields similarly to a CONCAT operation and concatenates the fields in their specified order. The same is true for the elements of any arrays used within the struct. The following table shows some examples of STRUCT prompt values and how they are interpreted:

    | Struct field types | Struct value | Semantic equivalent |
    |---|---|---|
    | STRUCT<STRING> | ('apples') | 'apples' |
    | STRUCT<STRING, STRING> | ('red', ' apples') | 'red apples' |
    | STRUCT<STRING, ARRAY<STRING>> | ('crisp ', ['red', ' apples']) | 'crisp red apples' |
- CATEGORIES: the categories by which to classify the input. You can specify categories with or without descriptions:
  - With descriptions: Use an ARRAY<STRUCT<STRING, STRING>> value where each struct contains the category name, followed by a description of the category. The array can only contain string literals. For example, you could use colors to classify sentiment:
      [('green', 'positive'), ('yellow', 'neutral'), ('red', 'negative')]
    You can optionally name the fields of the struct for your own readability, but the field names aren't used by the function:
      [STRUCT('green' AS label, 'positive' AS description), STRUCT('yellow' AS label, 'neutral' AS description), STRUCT('red' AS label, 'negative' AS description)]
  - Without descriptions: Use an ARRAY<STRING> value. The array can only contain string literals. This works well when your categories are self-explanatory. For example, you could use the following categories to classify sentiment:
      ['positive', 'neutral', 'negative']
  To handle input that doesn't closely match a category, consider including an 'Other' category.
- CONNECTION: a STRING value specifying the Cloud resource connection to use. The following forms are accepted:
  - Connection name: [PROJECT_ID].LOCATION.CONNECTION_ID
    For example, myproject.us.myconnection.
  - Fully qualified connection ID: projects/PROJECT_ID/locations/LOCATION/connections/CONNECTION_ID
    For example, projects/myproject/locations/us/connections/myconnection.
  Replace the following:
  - PROJECT_ID: the project ID of the project that contains the connection.
  - LOCATION: the location used by the connection.
  - CONNECTION_ID: the connection ID, for example myconnection. You can get this value by viewing the connection details in the Google Cloud console and copying the value in the last section of the fully qualified connection ID that is shown in Connection ID. For example, projects/myproject/locations/connection_location/connections/myconnection.
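The argument rules above can be modelled in a few lines of Python. This is only an illustrative sketch, not how BigQuery implements the function: combine_input mimics the CONCAT-like flattening of a STRUCT input, and qualify_connection expands the short connection name into the fully qualified form. Both function names and the default_project parameter are hypothetical, and the sketch assumes that an omitted PROJECT_ID means the caller's default project and that project IDs contain no dots.

```python
from typing import Sequence, Union

# A struct field is either a string or an array of strings.
Field = Union[str, Sequence[str]]


def combine_input(value: Union[str, Sequence[Field]]) -> str:
    """Model how a STRING or STRUCT input is flattened into one string."""
    if isinstance(value, str):
        return value
    parts = []
    for field in value:
        if isinstance(field, str):
            parts.append(field)
        else:
            # Array elements are joined in order, just like struct fields.
            parts.extend(field)
    # No separator is added, so spacing must be part of the literals.
    return "".join(parts)


def qualify_connection(connection: str, default_project: str) -> str:
    """Expand [PROJECT_ID].LOCATION.CONNECTION_ID to the fully qualified form."""
    if connection.startswith("projects/"):
        return connection  # already fully qualified
    parts = connection.split(".")
    if len(parts) == 2:
        # Assumption: PROJECT_ID was omitted, so fall back to a default.
        parts.insert(0, default_project)
    if len(parts) != 3:
        raise ValueError(f"unrecognized connection name: {connection!r}")
    project, location, connection_id = parts
    return f"projects/{project}/locations/{location}/connections/{connection_id}"


if __name__ == "__main__":
    print(combine_input(("crisp ", ["red", " apples"])))
    print(qualify_connection("myproject.us.myconnection", "unused-default"))
```

Because the fields are joined without a separator, any spacing between words has to be included in the string literals themselves, as in 'crisp ' and ' apples'.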
 
Output
AI.CLASSIFY returns a STRING value containing the provided category that best fits the input.
If the call to Vertex AI is unsuccessful for any reason, such as exceeding quota or model unavailability, then the function returns NULL.
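Because a failed call yields NULL rather than an error, client code may want to separate unclassified rows so that they can be retried or inspected. The following is a minimal sketch of that idea, assuming the query results have already been fetched as dict-like rows in which SQL NULL appears as None. The function name split_failed and the column name category are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Row = Dict[str, Optional[str]]


def split_failed(rows: Iterable[Row], column: str = "category") -> Tuple[List[Row], List[Row]]:
    """Separate rows with a category from rows where the call returned NULL."""
    classified: List[Row] = []
    failed: List[Row] = []
    for row in rows:
        # A NULL result (None here) means the Vertex AI call did not succeed.
        if row[column] is None:
            failed.append(row)
        else:
            classified.append(row)
    return classified, failed


if __name__ == "__main__":
    sample = [
        {"title": "Chip maker posts record profit", "category": "business"},
        {"title": "Cup final goes to penalties", "category": None},
    ]
    ok, retry = split_failed(sample)
    print(len(ok), "classified;", len(retry), "to retry")
```

Since each call incurs Vertex AI charges, resubmitting only the rows in the failed list avoids reclassifying inputs that already have a category.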
Examples
The following examples show how to use the AI.CLASSIFY function to classify text into predefined categories.
Classify text by topic
The following query categorizes BBC news articles into high-level categories:
SELECT
  title,
  body,
  AI.CLASSIFY(
    body,
    categories => ['tech', 'sport', 'business', 'politics', 'entertainment', 'other'],
    connection_id => 'us.example_connection') AS category
FROM
  `bigquery-public-data.bbc_news.fulltext`
LIMIT 100;
Classify reviews by sentiment
The following query classifies movie reviews of The English Patient by sentiment according to a custom color scheme. For example, a review that is very positive is classified as 'green'.
SELECT
  AI.CLASSIFY(
    ('Classify the review by sentiment: ', review),
    categories =>
      [('green', 'The review is positive.'),
       ('yellow', 'The review is neutral.'),
       ('red', 'The review is negative.')],
    connection_id => 'us.example_connection') AS ai_review_rating,
  reviewer_rating AS human_provided_rating,
  review,
FROM
  `bigquery-public-data.imdb.reviews`
WHERE
  title = 'The English Patient'
Locations
You can run AI.CLASSIFY in all of the regions that support Gemini models, and also in the US and EU multi-regions.
Quotas
See Generative AI functions quotas and limits.
What's next
- For more information about using Vertex AI models to generate text and embeddings, see Generative AI overview.
- For more information about using Cloud AI APIs to perform AI tasks, see AI application overview.
- For more information about supported SQL statements and functions for generative AI models, see End-to-end user journeys for generative AI models.