Open
Labels
:SearchOrg/Relevance, enhancement, Team:Search - Relevance
Description
Allow scripted bulk updates on indices with semantic text fields to determine whether a noop
or full reindex of the doc is necessary.
Indexing semantic_text fields is a resource-heavy task, so a common access pattern would be to only update documents if they have changed.
Imagine a doc under GET /my-index/_doc/1
{ "hash": "SOME-HASH", "semantic": "TEXT" }
The following successfully results in a noop.
POST /my-index/_update/1
{
  "scripted_upsert": true,
  "script": {
    "source": """
      if (ctx.op != 'create') {
        if (ctx._source.hash == params.hash) {
          ctx.op = "noop"
        }
      }
      ctx._source = params.doc
    """,
    "params": {
      "hash": "SOME-HASH",
      "doc": { "hash": "SOME-HASH", "semantic": "TEXT" }
    }
  }
}
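To make the intent of the script explicit, here is a minimal Python sketch of the decision it encodes. It is only a model, not Elasticsearch code: `decide_op` is a hypothetical helper, and it assumes that `ctx.op` starts as 'create' when the document does not exist yet and as 'index' when it does.

```python
def decide_op(existing_source, params):
    """Model the hash check from the update script above.

    existing_source: the stored _source as a dict, or None if the doc is missing.
    params: the script params, holding the incoming "hash" and "doc".
    """
    if existing_source is None:
        # Assumption: a scripted upsert on a missing doc runs with ctx.op == 'create'.
        return "create"
    if existing_source.get("hash") == params["hash"]:
        # Same hash: skip the write, so semantic_text is not indexed again.
        return "noop"
    # Different hash: the document is replaced with params["doc"].
    return "index"


stored = {"hash": "SOME-HASH", "semantic": "TEXT"}
params = {"hash": "SOME-HASH", "doc": {"hash": "SOME-HASH", "semantic": "TEXT"}}
print(decide_op(stored, params))
```

With the stored document and params from the request above, the hashes match, so the model takes the noop branch.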
Doing the same through bulk:
POST /semantic-docs-dev/_bulk
{"update":{"_id":"1"}}
{ "scripted_upsert": true, "script": { "source": "if (ctx.op != 'create') { if (ctx._source.hash == params.hash ) { ctx.op = 'noop' } else { ctx._source = params.doc } }", "params": { "hash": "SOME-HASH", "doc": { "hash": "SOME-HASH-2", "semantic": "DIFFERENT TEXT" } } } }
Will result in:
{
  "errors": true,
  "took": 0,
  "items": [
    {
      "update": {
        "_index": "semantic-docs-dev",
        "_id": "/docs/reference/integrations/sonicwall_firewall",
        "status": 400,
        "error": {
          "type": "status_exception",
          "reason": "Cannot apply update with a script on indices that contain [semantic_text] field(s)"
        }
      }
    }
  ]
}
I presume this is because the optimizations listed here: https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/semantic-text#semantic-text-updates are harder or impossible to track when scripts are used.
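Since the bulk API reports this rejection per item rather than failing the whole request, a client has to inspect each entry in `items`. The following is a rough sketch of doing so in Python against the response shown above; `failed_items` is a hypothetical helper name, not part of any Elasticsearch client.

```python
import json


def failed_items(bulk_response):
    """Return (action, _id, reason) for every bulk item that carries an error."""
    failures = []
    for item in bulk_response.get("items", []):
        # Each item is a single-key dict such as {"update": {...}}.
        for action, result in item.items():
            error = result.get("error")
            if error:
                failures.append((action, result.get("_id"), error.get("reason")))
    return failures


# The bulk response from this issue, copied verbatim.
response = json.loads("""
{ "errors": true, "took": 0, "items": [ { "update": {
  "_index": "semantic-docs-dev",
  "_id": "/docs/reference/integrations/sonicwall_firewall",
  "status": 400,
  "error": { "type": "status_exception",
    "reason": "Cannot apply update with a script on indices that contain [semantic_text] field(s)" } } } ] }
""")

for action, doc_id, reason in failed_items(response):
    print(action, doc_id, reason)
```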
It would be great to have a more controlled upsert through the bulk api to conditionally update semantic fields.
- Requiring the script to set ctx.semantic_update = true or similar
- Exposing this using new options on bulk update upserts directly, forgoing scripts altogether.