IMPORTANT: This documentation is no longer updated. Refer to Elastic's version policy and the latest documentation.

similarity

Elasticsearch allows you to configure a text scoring algorithm or similarity per field. The similarity setting provides a simple way of choosing a text similarity algorithm other than the default BM25, such as boolean.

Only text-based field types like text and keyword support this configuration.

Custom similarities can be configured by tuning the parameters of the built-in similarities. For more details about these expert options, see the similarity module.
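
As a rough illustration of what tuning a built-in similarity can look like, the following sketch builds a request body that defines a custom BM25 variant in the index settings and references it from a field. The names my_bm25 and title, and the helper custom_similarity_body, are hypothetical and chosen only for this example; the default values shown for k1 and b are assumed to match the BM25 defaults described in the similarity module.

```ruby
require 'json'

# Hypothetical helper: builds an index-creation body with a tuned BM25 similarity.
# 'my_bm25' and 'title' are illustrative names, not part of the original example.
def custom_similarity_body(k1: 1.2, b: 0.75)
  {
    settings: {
      index: {
        similarity: {
          # A custom similarity is a named entry with a built-in type plus parameters.
          my_bm25: { type: 'BM25', k1: k1, b: b }
        }
      }
    },
    mappings: {
      properties: {
        # The field refers to the custom similarity by its name.
        title: { type: 'text', similarity: 'my_bm25' }
      }
    }
  }
end

body = custom_similarity_body(k1: 1.3, b: 0.5)
puts JSON.pretty_generate(body)
```

The resulting hash could be passed as the body argument of client.indices.create, in the same way as the mapping example on this page.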

The only similarities that can be used out of the box, without any further configuration, are:

BM25
The Okapi BM25 algorithm. The algorithm used by default in Elasticsearch and Lucene.
boolean
A simple boolean similarity, which is used when full-text ranking is not needed and the score should only be based on whether the query terms match or not. Boolean similarity gives terms a score equal to their query boost.
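
To make the boolean behavior concrete, here is a toy model of it. This is not Lucene's implementation: it assumes that each matching query term contributes exactly its boost, regardless of how often the term occurs, and that the contributions of several terms are simply added. The function boolean_score and its input structure are hypothetical.

```ruby
# Toy model only, not Lucene's code.
# doc_terms:   array of terms in the document field
# query_terms: hash mapping each query term to its boost (hypothetical structure)
def boolean_score(doc_terms, query_terms)
  # A term scores its boost if it is present at all; term frequency is ignored.
  query_terms.sum(0.0) { |term, boost| doc_terms.include?(term) ? boost : 0.0 }
end

doc = %w[quick brown fox quick quick]

# 'quick' matches (boost 2.0) even though it appears three times; 'dog' does not match.
puts boolean_score(doc, { 'quick' => 2.0, 'dog' => 1.0 })  # prints 2.0
```

By contrast, BM25 would take term frequency, document length, and how rare the term is across the index into account.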

The similarity can be set on the field level when a field is first created, as follows:

response = client.indices.create(
  index: 'my-index-000001',
  body: {
    mappings: {
      properties: {
        default_field: {
          type: 'text'
        },
        boolean_sim_field: {
          type: 'text',
          similarity: 'boolean'
        }
      }
    }
  }
)
puts response

PUT my-index-000001
{
  "mappings": {
    "properties": {
      "default_field": {
        "type": "text"
      },
      "boolean_sim_field": {
        "type": "text",
        "similarity": "boolean"
      }
    }
  }
}

The default_field uses the BM25 similarity.

The boolean_sim_field uses the boolean similarity.
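
One way to observe the difference between the two fields is to run the same match query against each of them with an explicit boost. The sketch below only builds the query bodies; match_body is a hypothetical helper, and the search term fox is an arbitrary example. Based on the description of the boolean similarity, a hit on boolean_sim_field would be expected to score equal to the boost, while the score on default_field would depend on BM25 statistics for the index.

```ruby
require 'json'

# Hypothetical helper: builds a match query body with an explicit boost.
def match_body(field, text, boost: 1.0)
  { query: { match: { field => { query: text, boost: boost } } } }
end

boolean_query = match_body('boolean_sim_field', 'fox', boost: 2.0)
default_query = match_body('default_field', 'fox', boost: 2.0)

puts JSON.generate(boolean_query)
puts JSON.generate(default_query)
```

Each body could then be sent with client.search(index: 'my-index-000001', body: ...) and the _score values of the hits compared.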