Decimal digit token filter
Converts all digits in the Unicode Decimal_Number General Category to 0-9. For example, the filter changes the Bengali numeral ৩ to 3.
This filter uses Lucene’s DecimalDigitFilter.
The following analyze API request uses the decimal_digit filter to convert Devanagari numerals to 0-9:
GET /_analyze
{
  "tokenizer": "whitespace",
  "filter": ["decimal_digit"],
  "text": "१-one two-२ ३"
}
The filter produces the following tokens:
[ 1-one, two-2, 3 ]
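The conversion itself can be approximated outside Elasticsearch. The following Python sketch is an illustration only, not Lucene's implementation: it assumes that whitespace splitting stands in for the whitespace tokenizer and that Python's unicodedata database covers the same Decimal_Number (Nd) characters. The function name fold_decimal_digits is hypothetical.

```python
import unicodedata


def fold_decimal_digits(text: str) -> str:
    """Replace every character in Unicode category Nd with its ASCII digit."""
    folded = []
    for ch in text:
        if unicodedata.category(ch) == "Nd":
            # unicodedata.decimal returns the digit's integer value (0-9).
            folded.append(str(unicodedata.decimal(ch)))
        else:
            folded.append(ch)
    return "".join(folded)


# Assumption: str.split() approximates the whitespace tokenizer.
tokens = [fold_decimal_digits(token) for token in "१-one two-२ ३".split()]
print(tokens)  # ['1-one', 'two-2', '3']
```

Letters and punctuation pass through unchanged, so only the Devanagari numerals are rewritten.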
The following create index API request uses the decimal_digit filter to configure a new custom analyzer.
PUT /decimal_digit_example
{
  "settings": {
    "analysis": {
      "analyzer": {
        "whitespace_decimal_digit": {
          "tokenizer": "whitespace",
          "filter": [ "decimal_digit" ]
        }
      }
    }
  }
}
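Once the index exists, the custom analyzer can be checked by passing its name to the index-level analyze API. The Python sketch below only builds the two HTTP requests with the standard library. It assumes an unsecured cluster at http://localhost:9200, and the helper names build_request and send are hypothetical.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local cluster without authentication


def build_request(method: str, path: str, body: dict) -> urllib.request.Request:
    """Build a JSON request for the cluster without sending it."""
    return urllib.request.Request(
        ES_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )


def send(request: urllib.request.Request) -> dict:
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Same settings as the create index request above.
create_index = build_request("PUT", "/decimal_digit_example", {
    "settings": {
        "analysis": {
            "analyzer": {
                "whitespace_decimal_digit": {
                    "tokenizer": "whitespace",
                    "filter": ["decimal_digit"],
                }
            }
        }
    }
})

# Run the custom analyzer by name against the sample text.
analyze = build_request("POST", "/decimal_digit_example/_analyze", {
    "analyzer": "whitespace_decimal_digit",
    "text": "१-one two-२ ३",
})
```

Calling send(create_index) and then send(analyze) against a running cluster should return the same tokens as the earlier analyze request, because the analyzer combines the same tokenizer and filter.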