Query expansion is the incremental loosening of query constraints to include more results when none or too few are initially found. This leads to an adjustment of the result size per query.
When there are no relevant documents at all for a query, query expansion returns less relevant documents so that the query does not return zero search results.
Query expansion tutorial
This tutorial shows you how to enable the query-expansion feature. When a shopper uses an ambiguous or multi-word search phrase, they can get an empty response. After you turn on query expansion, the request is analyzed and an expanded list of products based on the parsed search query is returned.
Query expansion overview
Query expansion is a powerful feature designed to improve search recall and prevent zero-result scenarios, particularly for long-tail or complex user queries.
Instead of returning no results when an exact match isn't found in the product catalog, query expansion identifies and displays related or alternative products. This enhances user experience and can increase conversion rates.
Key use cases for query expansion are:
- Long-tail queries: For highly specific searches, such as diabetic high-protein low-fat organic milk, the catalog might not have a perfect match. Query expansion can return products that match parts of the intent, such as products tagged with attributes or attribute values like diabetic-friendly milk or high-protein milk.
- Alternative products: If your users search for a brand or product not in the catalog, such as Starbucks coffee 100ml, query expansion can suggest alternative coffee brands available for purchase, preventing a dead-end search.
The next sections outline the functionality, trigger mechanism, and configuration nuances of the query-expansion feature in Vertex AI Search for commerce, with a specific focus on the critical role of the canonical filter.
Query expansion trigger mechanism
The decision to activate query expansion for a given search query is automated and based on a rule that is evaluated against the canonical filter.
- Trigger condition: Query expansion is triggered only if the initial search with the canonical filter yields fewer than five product results.
- Threshold: This threshold of five is fixed and cannot be changed. It works well for most ecommerce use cases, ensuring that query expansion is activated only when the initial result set is genuinely sparse.
Understand the difference between top-level and canonical filters
To correctly implement query expansion, it is crucial to understand the two primary filter parameters in a search request.
- Top-level filter (filter): This is the main filter applied to the search results before they are returned to the user. It combines two potential layers:
  - Business or base filter: Predefined rules applied to all searches, often without direct user input, such as inStock=TRUE, category="groceries", storeId="XYZ".
  - User-selected facet filters: Filters applied dynamically by the user while interacting with the search interface, such as selecting facets for brand="Adidas", size="L".
- Canonical filter (canonical_filter): This is a special-purpose filter used exclusively by the query-expansion decision module. Its sole job is to define the catalog view against which the query-expansion trigger condition (fewer than five results) is evaluated.
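As a rough illustration of how the two parameters relate, the following sketch assembles a request as a plain Python dictionary. The helper names `and_join` and `build_search_request` are hypothetical, the filter strings reuse the simplified notation from this page rather than the exact filter syntax of the API, and setting the canonical filter to the business filter alone is an assumption made for this example.

```python
def and_join(*clauses: str) -> str:
    """Join non-empty filter clauses with AND (simplified notation)."""
    return " AND ".join(c for c in clauses if c)


def build_search_request(query, business_filter, facet_filters=()):
    """Hypothetical helper: returns a dict, not a real API request object."""
    return {
        "query": query,
        # Top-level filter: business rules plus whatever facets the user picked.
        "filter": and_join(business_filter, *facet_filters),
        # Canonical filter: business rules only (assumption for this sketch).
        "canonical_filter": business_filter,
    }


request = build_search_request(
    "t-shirt",
    business_filter='inStock=TRUE AND storeId="XYZ"',
    facet_filters=['brand="Adidas"', 'size="L"'],
)
print(request["filter"])
print(request["canonical_filter"])
```

In this sketch the facet clauses only ever reach the top-level filter, so the view used for the expansion decision stays independent of what the user selects in the interface.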
Core canonical filter functions
The canonical filter is designed to distinguish between an organically poor search result and a result set that was intentionally narrowed by the user.
Scenario 1
- User journey: The user searches for t-shirt and gets thousands of results. A default business_filter applies, for example one that keeps only in-stock products and products that match a custom store-level attribute. The user then applies facet filters for brand="Adidas" and size="L", which reduces the result count to two.
- If canonical_filter is the same as filter: The query-expansion decision system sees only two results and incorrectly triggers query expansion, showing related but irrelevant products, such as Nike t-shirts, which disregards the user's explicit filtering.
- The correct setting is canonical_filter = business filter: The query-expansion decision is then based on the initial query, excluding user-selected facets.
Scenario 2
- User journey: The user searches for adidas t-shirt with black graphic prints, which yields only one or two results, if any. A default business_filter applies, for example one that keeps only in-stock products and products that match a custom store-level attribute.
- If canonical_filter is not set or not configured correctly: The search with the canonical filter could find products that match the query but are out of stock or from a different store, that is, products with a different value of the custom store-level attribute. In this case, query expansion is not triggered.
- The correct setting is canonical_filter = business filter: The query-expansion decision finds fewer than five products for the long query and triggers query expansion, which brings in products related to the original query that are in stock and match the store-level attribute. The search results are thus expanded to include black graphic-print t-shirts from a different brand, graphic-print t-shirts in other colors, or other t-shirts from the brand in the query.
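The two scenarios can be replayed on a toy catalog. The following is a minimal sketch, not the service's implementation: the `Product` class, the substring matcher, the catalog contents, and the `THRESHOLD` constant are all assumptions chosen so that the counts mirror the scenarios above.

```python
from dataclasses import dataclass

THRESHOLD = 5  # assumed trigger threshold for this sketch


@dataclass(frozen=True)
class Product:
    title: str
    brand: str
    size: str
    in_stock: bool


def matches(query: str, p: Product) -> bool:
    """Toy matcher: every query word must appear in the title."""
    return all(word in p.title.lower() for word in query.lower().split())


def expansion_triggered(query, catalog, canonical_filter, threshold=THRESHOLD):
    """Count matches inside the canonical view and compare to the threshold."""
    count = sum(1 for p in catalog if matches(query, p) and canonical_filter(p))
    return count < threshold


def business(p):      # business filter: in-stock products only
    return p.in_stock


def with_facets(p):   # business filter plus user-selected facets
    return business(p) and p.brand == "Adidas" and p.size == "L"


def unset(p):         # canonical filter not configured
    return True


catalog = (
    [Product("Adidas t-shirt", "Adidas", "L", True)] * 2
    + [Product("Nike t-shirt", "Nike", "M", True)] * 8
    + [Product("Adidas t-shirt black graphic", "Adidas", "M", True)]
    + [Product("Adidas t-shirt black graphic", "Adidas", "M", False)] * 6
)

# Scenario 1: facets in the canonical filter trigger expansion by mistake.
print(expansion_triggered("t-shirt", catalog, with_facets))  # True
print(expansion_triggered("t-shirt", catalog, business))     # False

# Scenario 2: an unset canonical filter counts out-of-stock items and hides the gap.
long_query = "adidas t-shirt black graphic"
print(expansion_triggered(long_query, catalog, unset))       # False
print(expansion_triggered(long_query, catalog, business))    # True
```

In both scenarios only the canonical filter that equals the business filter produces the decision described as correct.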
Query expansion best practices
The canonical filter should almost always be set to be identical to your business or base filter. This ensures that the query-expansion module evaluates the query's potential against the same broad catalog view that your users initially see before they start applying facets.
End-to-end search and query-expansion process flow
When a search request is made, several parallel processes occur:
- Request received: The API receives the search request containing the query, the main filter, and the canonical_filter.
- Query-expansion decision search: In parallel, the query-expansion decision module performs its own internal search using the query combined with the canonical filter.
- Result count check: The module checks the number of products returned from its internal search.
  - If there are five or more results: Query expansion is not triggered. The standard search results proceed to the final filtering step.
  - If there are fewer than five results: Query expansion is triggered. The model systematically loosens the query to find related products. For example, for the query Pixel 5 phone, the model might find Pixel 4 phones, Pixel earbuds, or even Samsung phones.
- Final filtering: The product set (the original set or the expanded set from query expansion) is passed to the final stage. At this stage, the top-level filter, containing business rules and any user-selected facets, is strictly applied.
- Response sent: The final filtered list of products is returned in the API response.
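The flow above can be sketched end to end in a few lines. This is an illustrative model only: the word-overlap `loose_match` is a crude stand-in for the model that loosens the query, and the catalog, function names, and `THRESHOLD` constant are assumptions.

```python
THRESHOLD = 5  # trigger threshold used in the steps above


def strict_match(query: str, title: str) -> bool:
    """Unexpanded search: every query word must be a word of the title."""
    words = title.lower().split()
    return all(w in words for w in query.lower().split())


def loose_match(query: str, title: str) -> bool:
    """Stand-in for query loosening: any shared word counts as related."""
    words = title.lower().split()
    return any(w in words for w in query.lower().split())


def search(query, catalog, top_level_filter, canonical_filter):
    # Decision search: query combined with the canonical filter.
    decision_count = sum(
        1 for p in catalog
        if strict_match(query, p["title"]) and canonical_filter(p)
    )
    # Result count check.
    expanded = decision_count < THRESHOLD
    matcher = loose_match if expanded else strict_match
    candidates = [p for p in catalog if matcher(query, p["title"])]
    # Final filtering: the top-level filter is applied strictly.
    results = [p["title"] for p in candidates if top_level_filter(p)]
    return results, expanded


catalog = [
    {"title": "Pixel 5 phone", "in_stock": True},
    {"title": "Pixel 4 phone", "in_stock": True},
    {"title": "Pixel earbuds", "in_stock": True},
    {"title": "Samsung phone", "in_stock": False},
    {"title": "Nest Hub", "in_stock": True},
]


def in_stock(p):
    return p["in_stock"]


results, expanded = search("Pixel 5 phone", catalog, in_stock, in_stock)
print(expanded)  # True
print(results)   # ['Pixel 5 phone', 'Pixel 4 phone', 'Pixel earbuds']
```

The out-of-stock Samsung phone is found by the loosened query but removed in the final filtering step, which shows that expansion never bypasses the top-level filter.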
Advanced use case of selective query-expansion activation
You can strategically configure the filters to enable or disable query expansion for specific parts of your catalog.
Consider a large catalog containing groceries, electronics, and fashion apparel. For such a catalog, keep the following aspects in mind.
Goal
Enable query expansion for hard-to-find or scarce grocery queries but show zero results for electronics or fashion items. The business need here is to only enable query expansion selectively on the groceries part.
Configuration
For this use case scenario, the selective query expansion can be configured as follows:
- canonical_filter: Set it to be broad. It should include all categories (groceries, electronics, and fashion) plus any base rules such as stock availability. For example, define the canonical filter as (category="groceries" OR category="electronics" OR category="fashion") AND inStock=TRUE.
- filter: Set it to be narrow, based on the user's context. For a user in the grocery section, the filter would be category="groceries" AND inStock=TRUE.
How it works
Selective query expansion works in this scenario as follows:
- User searches for "iPhone 20": The query-expansion module uses the broad canonical_filter, finds existing iPhone models (five or more results), and decides not to trigger query expansion. The standard search results (existing iPhones) are then passed to the main filter, which blocks them because category="electronics" does not match category="groceries". The user correctly sees zero results.
- User searches for "high-protein diabetic milk": The query-expansion module uses the broad canonical_filter and finds fewer than five results, thus triggering query expansion, which finds related milk products. These products are passed to the main filter. Because they match category="groceries", they are returned to the user.
By manipulating the scope of the canonical_filter (the decision-making view) and the main filter (the final output view), you gain precise control over the search experience.
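The selective behavior can be illustrated with a small sketch. Retrieval is replaced here by a hand-written lookup table (`INDEX`) that lists, per query, the products an unexpanded search would match and the related products a loosened query would add. The table, the product data, and all names are hypothetical; only the roles of the two filters follow the configuration described above.

```python
THRESHOLD = 5  # trigger threshold assumed for this sketch


def item(title, category):
    return {"title": title, "category": category, "in_stock": True}


# Hypothetical retrieval results, written by hand for illustration.
INDEX = {
    "iPhone 20": {
        "initial": [item(f"iPhone {n}", "electronics") for n in range(11, 17)],
        "related": [],
    },
    "high-protein diabetic milk": {
        "initial": [item("High-protein diabetic-friendly milk", "groceries")],
        "related": [
            item("High-protein milk", "groceries"),
            item("Diabetic-friendly milk", "groceries"),
        ],
    },
}


def canonical_filter(p):
    """Broad decision view: all three categories, in stock."""
    return p["category"] in {"groceries", "electronics", "fashion"} and p["in_stock"]


def top_level_filter(p):
    """Narrow output view: groceries only, in stock."""
    return p["category"] == "groceries" and p["in_stock"]


def search(query):
    hits = INDEX[query]
    decision_count = sum(1 for p in hits["initial"] if canonical_filter(p))
    expanded = decision_count < THRESHOLD
    candidates = hits["initial"] + (hits["related"] if expanded else [])
    return [p["title"] for p in candidates if top_level_filter(p)], expanded


print(search("iPhone 20"))                   # ([], False)
print(search("high-protein diabetic milk"))
```

The electronics query finds enough products in the broad decision view to suppress expansion and is then emptied by the narrow top-level filter, while the grocery query falls below the threshold and returns the exact match followed by the related milk products.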
Example dataset
This page uses the following sample product dataset as an example.
Example product dataset
| ID | title | brands | categories | price_info.price |
|---|---|---|---|---|
| "nest_mini_2nd_gen" | "Nest Mini (2nd gen)" | ["Google", "Nest"] | ["Nest > speakers and displays"] | 49.00 |
| "nest_audio" | "Nest Audio" | ["Google", "Nest"] | ["Nest > speakers and displays"] | 99.99 |
| "nest_hub_max" | "Nest Hub Max" | ["Google", "Nest"] | ["Nest > speakers and displays"] | 229.00 |
| "nest_hub" | "Nest Hub" | ["Google", "Nest"] | ["Nest > speakers and displays"] | 88.99 |
| "google_home_max" | "Google Home Max" | ["Google", "Nest"] | ["Nest > speakers and displays"] | 299.00 |
| "google_home_mini" | "Google Home Mini" | ["Google", "Nest"] | ["Nest > speakers and displays"] | 49.00 |
| "google_pixel_5" | "Google Pixel 5" | ["Google", "Pixel"] | ["Pixel > phones"] | 699.00 |
| "google_pixel_4a_with_5g" | "Google Pixel 4a with 5G" | ["Google", "Pixel"] | ["Pixel > phones"] | 499.00 |
| "google_pixel_4a" | "Google Pixel 4a Phones" | ["Google", "Pixel"] | ["Pixel > phones"] | 349.00 |
| "google_pixel_stand" | "Google Pixel Stand" | ["Google", "Pixel"] | ["Pixel > featured accessories"] | 79.00 |
| "google_pixel_buds" | "Google Pixel Buds" | ["Google", "Pixel"] | ["Pixel > featured accessories"] | 179.00 |
| "google_pixel_5_case" | "Google Pixel 5 Case" | ["Google", "Pixel"] | ["Pixel > featured accessories"] | 40.00 |
| "google_pixel_4a_5g_case" | "Google Pixel 4a (5G) Case" | ["Google", "Pixel"] | ["Pixel > featured accessories"] | 40.00 |
| "google_pixel_4a_case" | "Google Pixel 4a Case" | ["Google", "Pixel"] | ["Pixel > featured accessories"] | 40.00 |
Query expansion
Query expansion increases the recall for query terms with few results, especially long tail queries.
This search feature is driven by a specification that determines the query-expansion conditions. It includes a pinUnexpandedResults option that is off by default. When set to true, it places unexpanded products at the top of the search results, followed by the expanded results.
For example, if you search for Google Pixel 5 without query expansion, the result is restricted to the google_pixel_5 ID. With query expansion, however, you might also get the google_pixel_4a_with_5g, google_pixel_4a, and google_pixel_5_case IDs from the example product dataset.
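To make the effect of pinUnexpandedResults concrete, the following sketch orders a result list with and without pinning. The relevance scores and the `order_results` helper are invented for illustration and do not reflect how the service ranks products; only the product IDs come from the example dataset.

```python
def order_results(scored, unexpanded_ids, pin_unexpanded_results=False):
    """Rank by score; optionally move unexpanded products to the top.

    scored: list of (product_id, hypothetical_relevance_score) pairs.
    """
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    if pin_unexpanded_results:
        # Stable sort: unexpanded items (key False) come first, and the
        # score order is preserved within each group.
        ranked.sort(key=lambda pair: pair[0] not in unexpanded_ids)
    return [product_id for product_id, _ in ranked]


# Scores are made up for this example.
scored = [
    ("google_pixel_5_case", 0.9),
    ("google_pixel_5", 0.8),
    ("google_pixel_4a_with_5g", 0.7),
    ("google_pixel_4a", 0.6),
]
unexpanded = {"google_pixel_5"}

print(order_results(scored, unexpanded))
print(order_results(scored, unexpanded, pin_unexpanded_results=True))
```

With pinning off, the made-up scores put google_pixel_5_case first. With pinning on, google_pixel_5 moves to the top and the expanded results follow in their original relative order.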