Preface
- This article applies to Elasticsearch 7.17. Official documentation: Token count field type
Test
- Create the mapping

    PUT my_index
    {
      "mappings": {
        "properties": {
          "name": {
            "type": "keyword",
            "doc_values": true,
            "fields": {
              "length": {
                "type": "token_count",
                "analyzer": "standard"
              }
            }
          }
        }
      }
    }

- Index test data

    POST my_index/_doc/_1
    {"name": ["A B", "X Y"]}

    POST my_index/_doc/2
    {"name": ["A B C", "X Y"]}

- Query

    GET my_index/_search
    {
      "query": {
        "range": {
          "name.length": {
            "gte": 1,
            "lte": 10
          }
        }
      },
      "_source": ["*"],
      "script_fields": {
        "token_count": {
          "script": {
            "source": "doc['name.length']",
            "lang": "painless"
          }
        }
      }
    }

The query returns the following hits:

    {
      "total": { "value": 2, "relation": "eq" },
      "max_score": 1.0,
      "hits": [
        {
          "_index": "my_index",
          "_type": "_doc",
          "_id": "_1",
          "_score": 1.0,
          "_source": { "name": ["A B", "X Y"] },
          "fields": { "token_count": [2, 2] }
        },
        {
          "_index": "my_index",
          "_type": "_doc",
          "_id": "2",
          "_score": 1.0,
          "_source": { "name": ["A B C", "X Y"] },
          "fields": { "token_count": [2, 3] }
        }
      ]
    }

Use case
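- Can an analyzed (tokenized) match in Elasticsearch match the query tokens exactly, so that a value is recalled only if it consists of the query tokens and nothing else?
- The data is as follows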

    doc1 ["A B", "X Y"]
    doc2 ["A B C", "X Y"]

- A search for "A B" or "B A" should recall only doc1 and must not recall doc2.
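- Use `nested` combined with `token_count` to handle this
- Create the index (if `my_index` from the test above still exists, delete it first)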

    PUT my_index
    {
      "mappings": {
        "properties": {
          "name": {
            "type": "nested",
            "properties": {
              "value": {
                "type": "keyword",
                "fields": {
                  "text": {
                    "type": "text",
                    "analyzer": "standard"
                  },
                  "length": {
                    "type": "token_count",
                    "analyzer": "standard"
                  }
                }
              }
            }
          }
        }
      }
    }

- Index the data

    POST my_index/_doc/1
    {
      "name": [
        { "value": "A B" },
        { "value": "X Y" }
      ]
    }

    POST my_index/_doc/2
    {
      "name": [
        { "value": "A B C" },
        { "value": "X Y" }
      ]
    }

- Inspect the data

    GET my_index/_doc/1?_source=name.value
    GET my_index/_termvectors/1?fields=name.value.text
    GET my_index/_search

- Query the data
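
    GET my_index/_search
    {
      "query": {
        "nested": {
          "path": "name",
          "query": {
            "bool": {
              "filter": [
                {
                  "match": {
                    "name.value.text": {
                      "query": "A B",
                      "operator": "and"
                    }
                  }
                },
                {
                  "range": {
                    "name.value.length": {
                      "gte": 2,
                      "lte": 2
                    }
                  }
                }
              ]
            }
          }
        }
      }
    }

Here `"operator": "and"` requires every query token to be present (the default `or` would also accept a two-token value such as "A Z"), and the `range` filter pins the token count to 2. Together they match only a nested value that consists of exactly the tokens A and B, in any order.

This article is from qbit snap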
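To make the logic of the nested query concrete, here is a minimal in-memory sketch in Python. It does not talk to Elasticsearch. The `tokenize` helper is only a rough stand-in for the `standard` analyzer, and the function names (`tokenize`, `nested_value_matches`, `search`) are hypothetical. The sketch assumes that every query token must be present (as a `match` with `"operator": "and"` would require) and that the token-count range is set to the number of query tokens.

```python
import re


def tokenize(text):
    # Assumption: lowercase and split on non-alphanumerics, a crude
    # approximation of what the standard analyzer does for simple ASCII input.
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def nested_value_matches(value, query):
    value_tokens = tokenize(value)
    query_tokens = tokenize(query)
    # Models the match clause: every query token must occur in the value.
    contains_all = all(t in value_tokens for t in query_tokens)
    # Models the range filter on the token_count sub-field (gte == lte).
    same_length = len(value_tokens) == len(query_tokens)
    return contains_all and same_length


def search(docs, query):
    # Models the nested query: a document is a hit when at least one of its
    # nested values satisfies both conditions at the same time.
    return [
        doc_id
        for doc_id, values in docs.items()
        if any(nested_value_matches(v, query) for v in values)
    ]


docs = {
    "doc1": ["A B", "X Y"],
    "doc2": ["A B C", "X Y"],
}

print(search(docs, "A B"))    # ['doc1']
print(search(docs, "B A"))    # ['doc1']
print(search(docs, "A B C"))  # ['doc2']
```

Both conditions are checked against the same nested value, which is why the field is mapped as `nested`: without it, the match could be satisfied by one array element and the length filter by another.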