Unique token filter

Removes duplicate tokens from a stream. For example, you can use the unique filter to change the lazy lazy dog to the lazy dog.

If the only_on_same_position parameter is set to true, the unique filter removes only duplicate tokens in the same position.

Note

When only_on_same_position is true, the unique filter works the same as the remove_duplicates filter.
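For example, the keyword_repeat and stemmer filters emit duplicate tokens at the same position, which only_on_same_position is designed to handle. The following analyze request is an illustrative sketch (the sample text and the inline filter definition are not part of the examples below) showing that pattern:

 GET _analyze
{
  "tokenizer" : "whitespace",
  "filter" : [
    "keyword_repeat",
    "stemmer",
    {
      "type" : "unique",
      "only_on_same_position" : true
    }
  ],
  "text" : "jumping dog"
}

Here keyword_repeat duplicates each token in place and stemmer stems one copy, so the request should return something like [ jumping, jump, dog ]: the identical dog tokens at the same position are collapsed, while jumping and jump are kept because they differ.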

The following analyze API request uses the unique filter to remove duplicate tokens from the quick fox jumps the lazy fox:

 GET _analyze
{
  "tokenizer" : "whitespace",
  "filter" : ["unique"],
  "text" : "the quick fox jumps the lazy fox"
}

The filter removes duplicated tokens for the and fox, producing the following output:

 [ the, quick, fox, jumps, lazy ] 

The following create index API request uses the unique filter to configure a new custom analyzer.

 PUT custom_unique_example
{
  "settings" : {
    "analysis" : {
      "analyzer" : {
        "standard_truncate" : {
          "tokenizer" : "standard",
          "filter" : ["unique"]
        }
      }
    }
  }
}
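Once the index exists, you could check the analyzer with an analyze request against it. This request and its sample text are an illustrative check, not part of the original example:

 GET custom_unique_example/_analyze
{
  "analyzer" : "standard_truncate",
  "text" : "the lazy lazy dog"
}

This should return the, lazy, dog, with the repeated lazy removed.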
Configurable parameters

only_on_same_position
(Optional, Boolean) If true, only remove duplicate tokens in the same position. Defaults to false.

To customize the unique filter, duplicate it to create the basis for a new custom token filter. You can modify the filter using its configurable parameters.

For example, the following request creates a custom unique filter with only_on_same_position set to true.

 PUT letter_unique_pos_example
{
  "settings": {
    "analysis": {
      "analyzer": {
        "letter_unique_pos": {
          "tokenizer": "letter",
          "filter": [ "unique_pos" ]
        }
      },
      "filter": {
        "unique_pos": {
          "type": "unique",
          "only_on_same_position": true
        }
      }
    }
  }
}
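To see the effect of only_on_same_position, you could run the custom analyzer against text that repeats a token at different positions. The following request and its sample text are an illustration, not part of the original example:

 GET letter_unique_pos_example/_analyze
{
  "analyzer": "letter_unique_pos",
  "text": "the lazy lazy dog"
}

Because the two lazy tokens occupy different positions, both should remain in the output; the filter only removes duplicates that share a position.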