To configure indexes for similarity searches, set the following fields.
For instructions on how to configure an index, see Configure index parameters.
NearestNeighborSearch
| Fields | |
|---|---|
contentsDeltaUri | Allows inserting, updating or deleting the contents of the Vector Search index. If you set this field when calling |
isCompleteOverwrite | If this field is set together with |
config | The configuration of the Vector Search index. |
NearestNeighborSearchConfig
| Fields | |
|---|---|
dimensions | Required. The number of dimensions of the input vectors. Used for dense embeddings only. |
approximateNeighborsCount | Required if the tree-AH algorithm is used. The default number of neighbors to find through approximate search before exact reordering is performed. Exact reordering is a procedure where results returned by an approximate search algorithm are reordered using a more expensive distance computation. |
shardSize | The size of each shard. When an index is large, it is sharded based on the specified shard size. During serving, each shard is served on a separate node and scales independently. |
distanceMeasureType | The distance measure used in nearest neighbor search. |
featureNormType | The type of normalization to be carried out on each vector. |
algorithmConfig | oneOf: The configuration for the algorithms that Vector Search uses for efficient search. Used for dense embeddings only. |
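As a rough illustration of how these fields fit together, the following Python sketch assembles a configuration dictionary and enforces the two "Required" rules from the table. The helper `build_index_metadata` is hypothetical, the value 768 and the neighbor count 150 are arbitrary, and the lower-camel-case keys `shardSize`, `treeAhConfig`, and `bruteForceConfig` are assumptions about the JSON spelling rather than something this page confirms.

```python
import json


def build_index_metadata(dimensions, approximate_neighbors_count=None,
                         use_tree_ah=True, shard_size=None,
                         distance_measure_type="DOT_PRODUCT_DISTANCE",
                         feature_norm_type="NONE"):
    """Assemble a NearestNeighborSearch-style dict (hypothetical helper)."""
    if dimensions <= 0:
        raise ValueError("dimensions must be a positive integer")
    if use_tree_ah and approximate_neighbors_count is None:
        # The table marks this field as required when tree-AH is used.
        raise ValueError("approximateNeighborsCount is required for tree-AH")

    config = {
        "dimensions": dimensions,
        "distanceMeasureType": distance_measure_type,  # documented default
        "featureNormType": feature_norm_type,          # documented default
    }
    if shard_size is not None:
        config["shardSize"] = shard_size  # key spelling is an assumption
    if use_tree_ah:
        config["approximateNeighborsCount"] = approximate_neighbors_count
        # Assumed key name; an empty object leaves tree-AH fields at defaults.
        config["algorithmConfig"] = {"treeAhConfig": {}}
    else:
        # Assumed key name; brute force takes an empty object.
        config["algorithmConfig"] = {"bruteForceConfig": {}}
    return {"config": config}


metadata = build_index_metadata(dimensions=768, approximate_neighbors_count=150)
print(json.dumps(metadata, indent=2))
```

Validating locally like this only catches the constraints stated on this page; the service may apply further checks.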
DistanceMeasureType
| Enums | |
|---|---|
SQUARED_L2_DISTANCE | Euclidean (L2) Distance |
L1_DISTANCE | Manhattan (L1) Distance |
DOT_PRODUCT_DISTANCE | Default value. Defined as a negative of the dot product. |
COSINE_DISTANCE | Cosine distance. We strongly suggest using DOT_PRODUCT_DISTANCE with UNIT_L2_NORM instead of COSINE_DISTANCE. Our algorithms are better optimized for DOT_PRODUCT_DISTANCE, and when it is combined with UNIT_L2_NORM, it is mathematically equivalent to cosine distance and produces the same ranking. |
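The equivalence claimed for COSINE_DISTANCE can be checked with a short derivation. It assumes that cosine distance is defined as one minus cosine similarity, which this page does not state; $\hat{q}$ and $\hat{x}$ denote the query and database vectors after unit L2 normalization.

$$
\begin{aligned}
\hat{q} &= \frac{q}{\lVert q \rVert_2}, \qquad \hat{x} = \frac{x}{\lVert x \rVert_2} \\
d_{\text{dot}}(\hat{q}, \hat{x}) &= -\,\hat{q} \cdot \hat{x} = -\frac{q \cdot x}{\lVert q \rVert_2 \, \lVert x \rVert_2} \\
d_{\cos}(q, x) &= 1 - \frac{q \cdot x}{\lVert q \rVert_2 \, \lVert x \rVert_2} = 1 + d_{\text{dot}}(\hat{q}, \hat{x})
\end{aligned}
$$

Under that assumption the two distances differ only by the constant 1, so sorting neighbors by either one gives the same order.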
ShardSize
| Enums | |
|---|---|
SHARD_SIZE_SMALL | 2 GiB per shard |
SHARD_SIZE_MEDIUM | 20 GiB per shard |
SHARD_SIZE_LARGE | 50 GiB per shard |
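To get a feel for how the shard size affects the number of serving nodes, the sketch below estimates a shard count from the raw size of the embeddings. This is a back-of-the-envelope calculation only: the helper `estimate_shard_count` is hypothetical, it assumes 4-byte floats and ignores any compression or index overhead, and the service's actual sharding logic may differ.

```python
import math

GIB = 1024 ** 3

# Shard sizes as listed in the ShardSize table.
SHARD_SIZE_BYTES = {
    "SHARD_SIZE_SMALL": 2 * GIB,
    "SHARD_SIZE_MEDIUM": 20 * GIB,
    "SHARD_SIZE_LARGE": 50 * GIB,
}


def estimate_shard_count(num_vectors, dimensions, shard_size, bytes_per_value=4):
    """Rough estimate: raw embedding bytes divided by shard size, rounded up."""
    raw_bytes = num_vectors * dimensions * bytes_per_value
    return max(1, math.ceil(raw_bytes / SHARD_SIZE_BYTES[shard_size]))


# Example: 100 million 768-dimensional vectors (illustrative numbers).
for name in SHARD_SIZE_BYTES:
    print(name, estimate_shard_count(100_000_000, 768, name))
```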
FeatureNormType
| Enums | |
|---|---|
UNIT_L2_NORM | Unit L2 normalization type. |
NONE | Default value. No normalization type is specified. |
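The effect of the normalization type can be seen on a toy example. The sketch below ranks two vectors by negative dot product, first without normalization and then after unit L2 normalization. The function names and data are illustrative, and it assumes normalization is applied to both the query and the stored vectors.

```python
import math


def unit_l2_normalize(vec):
    """Scale a vector to length 1 under the L2 norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [v / norm for v in vec]


def dot_product_distance(a, b):
    """Negative dot product, as DOT_PRODUCT_DISTANCE is described."""
    return -sum(x * y for x, y in zip(a, b))


def rank(query, vectors, normalize):
    """Return vector ids ordered from nearest to farthest."""
    prep = unit_l2_normalize if normalize else (lambda v: v)
    q = prep(query)
    return sorted(vectors, key=lambda k: dot_product_distance(q, prep(vectors[k])))


query = [1.0, 1.0]
vectors = {
    "long": [10.0, 0.0],  # large magnitude, different direction
    "close": [1.0, 1.0],  # same direction as the query
}

ranked_none = rank(query, vectors, normalize=False)
ranked_unit = rank(query, vectors, normalize=True)
print(ranked_none, ranked_unit)
```

Without normalization the large-magnitude vector wins on raw dot product; with unit L2 normalization only direction matters, so the vector aligned with the query ranks first.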
TreeAhConfig
These are the fields to set when you select the tree-AH algorithm.
| Fields | |
|---|---|
fractionLeafNodesToSearch | double. The default fraction of leaf nodes that any query may search. Must be in the range 0.0 to 1.0, exclusive. The default value is 0.05 if not set. |
leafNodeEmbeddingCount | int32. The number of embeddings on each leaf node. The default value is 1000 if not set. |
leafNodesToSearchPercent | int32. Deprecated, use fractionLeafNodesToSearch. The default percentage of leaf nodes that any query may search. Must be in the range 1 to 100, inclusive. The default value is 10 (meaning 10%) if not set. |
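The ranges and defaults in this table can be captured in a small validation helper. The Python sketch below is illustrative: `build_tree_ah_config`, `percent_to_fraction`, and `estimate_embeddings_scanned` are hypothetical names, and the scan estimate is a simplification that assumes every leaf holds exactly leafNodeEmbeddingCount embeddings.

```python
import math


def build_tree_ah_config(fraction_leaf_nodes_to_search=0.05,
                         leaf_node_embedding_count=1000):
    """Return a TreeAhConfig-style dict using the documented defaults."""
    if not 0.0 < fraction_leaf_nodes_to_search < 1.0:
        raise ValueError("fractionLeafNodesToSearch must be in (0.0, 1.0), exclusive")
    if leaf_node_embedding_count <= 0:
        raise ValueError("leafNodeEmbeddingCount must be positive")
    return {
        "fractionLeafNodesToSearch": fraction_leaf_nodes_to_search,
        "leafNodeEmbeddingCount": leaf_node_embedding_count,
    }


def percent_to_fraction(leaf_nodes_to_search_percent):
    """Convert the deprecated percent field (1-100) to a fraction.

    Note that 100 maps to 1.0, which falls outside the exclusive range
    accepted for fractionLeafNodesToSearch.
    """
    if not 1 <= leaf_nodes_to_search_percent <= 100:
        raise ValueError("leafNodesToSearchPercent must be in 1-100, inclusive")
    return leaf_nodes_to_search_percent / 100.0


def estimate_embeddings_scanned(total_embeddings, config):
    """Rough estimate of embeddings visited per query (simplified model)."""
    per_leaf = config["leafNodeEmbeddingCount"]
    leaves = math.ceil(total_embeddings / per_leaf)
    searched = max(1, round(leaves * config["fractionLeafNodesToSearch"]))
    return searched * per_leaf


tree_ah = build_tree_ah_config()
print(tree_ah, estimate_embeddings_scanned(1_000_000, tree_ah))
```

Raising the fraction trades query latency for recall, since more leaves are scanned per query.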
BruteForceConfig
This option implements the standard linear search in the database for each query. There are no fields to configure for a brute force search. To select this algorithm, pass an empty object for BruteForceConfig to algorithmConfig.
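For intuition, a linear search of the kind described here can be sketched in a few lines of Python. This is not the service's implementation: the function names and data are illustrative, squared L2 distance is chosen arbitrarily as the measure, and the `bruteForceConfig` key name in the configuration fragment is an assumption.

```python
import heapq

# Selecting brute force: an empty object passed to algorithmConfig
# (the key spelling is assumed).
algorithm_config = {"bruteForceConfig": {}}


def squared_l2_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_search(query, vectors, k):
    """Compare the query against every stored vector and keep the k nearest."""
    scored = ((squared_l2_distance(query, vec), vec_id)
              for vec_id, vec in vectors.items())
    return [vec_id for _, vec_id in heapq.nsmallest(k, scored)]


vectors = {"a": [0.0, 0.0], "b": [1.0, 1.0], "c": [5.0, 5.0]}
nearest = brute_force_search([0.9, 0.9], vectors, k=2)
print(nearest)
```

Because every vector is compared with every query, the results are exact, but the cost per query grows linearly with the size of the index.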