kuromoji_readingform token filter

The kuromoji_readingform token filter replaces the token with its reading form in either katakana or romaji. It accepts the following setting:

use_romaji
Whether romaji reading form should be output instead of katakana. Defaults to false.

When using the pre-defined kuromoji_readingform filter, use_romaji is set to true. The default when defining a custom kuromoji_readingform, however, is false. The only reason to use the custom form is if you need the katakana reading form:

    PUT kuromoji_sample
    {
      "settings": {
        "index": {
          "analysis": {
            "analyzer": {
              "romaji_analyzer": {
                "tokenizer": "kuromoji_tokenizer",
                "filter": [ "romaji_readingform" ]
              },
              "katakana_analyzer": {
                "tokenizer": "kuromoji_tokenizer",
                "filter": [ "katakana_readingform" ]
              }
            },
            "filter": {
              "romaji_readingform": {
                "type": "kuromoji_readingform",
                "use_romaji": true
              },
              "katakana_readingform": {
                "type": "kuromoji_readingform",
                "use_romaji": false
              }
            }
          }
        }
      }
    }

    GET kuromoji_sample/_analyze
    {
      "analyzer": "katakana_analyzer",
      "text": "寿司"
    }

    GET kuromoji_sample/_analyze
    {
      "analyzer": "romaji_analyzer",
      "text": "寿司"
    }

  1. The katakana_analyzer request returns スシ.
  2. The romaji_analyzer request returns sushi.
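
For readers who script these requests rather than run them from a console, the following is a minimal Python sketch of the same index definition and analyze calls, using only the standard library. The node address `http://localhost:9200`, the helper names (`readingform_filter`, `build_index_body`, `send`, `reading_forms`), and the use of POST for `_analyze` are assumptions made for illustration; the sketch also assumes the analysis-kuromoji plugin is installed and that the cluster needs no authentication.

```python
import json
import urllib.request

# Assumption: an unsecured local node with the analysis-kuromoji plugin installed.
ES_URL = "http://localhost:9200"


def readingform_filter(use_romaji: bool) -> dict:
    """Definition of a custom kuromoji_readingform token filter."""
    return {"type": "kuromoji_readingform", "use_romaji": use_romaji}


def build_index_body() -> dict:
    """Settings body equivalent to the PUT kuromoji_sample request above."""
    return {
        "settings": {
            "index": {
                "analysis": {
                    "analyzer": {
                        "romaji_analyzer": {
                            "tokenizer": "kuromoji_tokenizer",
                            "filter": ["romaji_readingform"],
                        },
                        "katakana_analyzer": {
                            "tokenizer": "kuromoji_tokenizer",
                            "filter": ["katakana_readingform"],
                        },
                    },
                    "filter": {
                        "romaji_readingform": readingform_filter(True),
                        "katakana_readingform": readingform_filter(False),
                    },
                }
            }
        }
    }


def send(method: str, path: str, body: dict) -> dict:
    """Send a JSON request to the cluster and decode the JSON response."""
    request = urllib.request.Request(
        f"{ES_URL}/{path}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def reading_forms(analyzer: str, text: str, index: str = "kuromoji_sample") -> list:
    """Return the token strings that the named analyzer produces for text."""
    # POST is used instead of GET so the request body is sent reliably.
    result = send("POST", f"{index}/_analyze", {"analyzer": analyzer, "text": text})
    return [token["token"] for token in result["tokens"]]
```

Against a running cluster, `send("PUT", "kuromoji_sample", build_index_body())` would create the index, after which `reading_forms("katakana_analyzer", "寿司")` and `reading_forms("romaji_analyzer", "寿司")` correspond to the two analyze requests above.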