- Notifications
You must be signed in to change notification settings - Fork 25.5k
Adding NormalizeForStreamProcessor
#125699
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hi @eyalkoren, I've created a changelog YAML for you. |
Pinging @elastic/es-data-management (Team:Data Management) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Eyal, I left some initial comments. I had a question about the way that we nest a document with an existing attributes
field into the OTel attributes. Is this something we want to do? For example this doc:
{ "attributes": { "a": "b", "c": [1, 2, 3] }, "log.level": "1234" }
Becomes, after processing:
{ "resource": { "attributes": {} }, "severity_text": "1234", "attributes": { "attributes.a": "b", "attributes.c": [1, 2, 3] } }
Is that the desired behavior for an existing attributes
field?
modules/ingest-ecs/src/main/java/org/elasticsearch/ingest/ecs/EcsNamespacingProcessor.java Outdated Show resolved Hide resolved
modules/ingest-ecs/src/main/java/org/elasticsearch/ingest/ecs/EcsNamespacingProcessor.java Outdated Show resolved Hide resolved
modules/ingest-ecs/src/main/java/org/elasticsearch/ingest/ecs/EcsNamespacingProcessor.java Outdated Show resolved Hide resolved
modules/ingest-ecs/src/main/java/org/elasticsearch/ingest/ecs/EcsNamespacingProcessor.java Outdated Show resolved Hide resolved
modules/ingest-ecs/src/main/java/org/elasticsearch/ingest/ecs/EcsNamespacingProcessor.java Outdated Show resolved Hide resolved
modules/ingest-ecs/src/main/java/org/elasticsearch/ingest/ecs/EcsNamespacingProcessor.java Outdated Show resolved Hide resolved
modules/ingest-ecs/src/main/java/org/elasticsearch/ingest/ecs/EcsNamespacingProcessor.java Outdated Show resolved Hide resolved
modules/ingest-ecs/src/main/java/org/elasticsearch/ingest/ecs/EcsNamespacingProcessor.java Outdated Show resolved Hide resolved
modules/ingest-ecs/src/main/java/org/elasticsearch/ingest/ecs/EcsNamespacingProcessor.java Outdated Show resolved Hide resolved
modules/ingest-ecs/src/test/java/org/elasticsearch/ingest/ecs/EcsNamespacingProcessorTests.java Outdated Show resolved Hide resolved
Answering the general question:
We started off by trying to be more "clever" about this and merge OTel with non-OTel. Then we had to handle lots of corner cases, like:
And so forth. So last week we decided to change the way we think about it: a document is either sent by an OTel-compliant shipper, or not. If not, no reason to treat the original fields as if they have the OTel sematics. So even if it has a field that has an OTel name, we can consider it to be by chance and namespace it. If that's so- no reason to complicate things for the unlikely event where non-OTel documents contain fields with intended OTel semantics. |
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
…to ECS-namespacing-processor
EcsNamespaceProcessor
NormalizeToStreamProcessor
NormalizeToStreamProcessor
NormalizeForStreamProcessor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opened a ticket for us to track adding the periodic ci to run the otel-semver-crawler test bits, but I'm fine with that staying disabled for now on this PR.
I see also that there's a perhaps unfinished conversation about the wording of the documentation. I'm okay if that's either resolved here and now, or if we fuss with the wording of things in a follow up PR.
✅
I believe it's the case that this PR is intended to be backported to 8.19.0, so I'm going to add that label, and also add the label for attempting to automatically backport it. Feel free to remove those labels if this is not actually intended for 8.19.0, though. |
This should go to 8.19 |
CI keeps being unhappy with random stuff, I don't think related to this PR contents. Regarding the followup task of running the test nightly - do you want me to open a GH issue with summary of what I already discussed with the delivery team? |
For my part, yes. 👍 (And quick, it's green!) |
I'll reach out to you offline. |
…or` (#134524) Fixes field querying and writing logic for NormalizeForStreamProcessor so that it can function on both `classic` and `flexible` ingest pipeline access patterns. NormalizeForStreamProcessor was added in #125699 with support for the default ingest node field access logic (now known as `classic` mode). We have since added support for the `flexible` access pattern in ingest pipelines, which allows for querying dotted field names and writing dotted field names when parent path elements are missing. The NormalizeForStreamProcessor was written with the classic access pattern in mind. The processor was designed to look for singular field names and to rely on the classic field writing logic which creates intermediate parent objects when setting a value that is nested in the document. When flexible mode was enabled, the logic did not anticipate dotted field names that could be inconsistently accessible from the source map at certain points in the path notation. Further, the flexible access pattern does not create intermediate parent objects like before. A secondary renaming method was added to take these changes into account.
…or` (elastic#134524) Fixes field querying and writing logic for NormalizeForStreamProcessor so that it can function on both `classic` and `flexible` ingest pipeline access patterns. NormalizeForStreamProcessor was added in elastic#125699 with support for the default ingest node field access logic (now known as `classic` mode). We have since added support for the `flexible` access pattern in ingest pipelines, which allows for querying dotted field names and writing dotted field names when parent path elements are missing. The NormalizeForStreamProcessor was written with the classic access pattern in mind. The processor was designed to look for singular field names and to rely on the classic field writing logic which creates intermediate parent objects when setting a value that is nested in the document. When flexible mode was enabled, the logic did not anticipate dotted field names that could be inconsistently accessible from the source map at certain points in the path notation. Further, the flexible access pattern does not create intermediate parent objects like before. A secondary renaming method was added to take these changes into account.
…or` (elastic#134524) Fixes field querying and writing logic for NormalizeForStreamProcessor so that it can function on both `classic` and `flexible` ingest pipeline access patterns. NormalizeForStreamProcessor was added in elastic#125699 with support for the default ingest node field access logic (now known as `classic` mode). We have since added support for the `flexible` access pattern in ingest pipelines, which allows for querying dotted field names and writing dotted field names when parent path elements are missing. The NormalizeForStreamProcessor was written with the classic access pattern in mind. The processor was designed to look for singular field names and to rely on the classic field writing logic which creates intermediate parent objects when setting a value that is nested in the document. When flexible mode was enabled, the logic did not anticipate dotted field names that could be inconsistently accessible from the source map at certain points in the path notation. Further, the flexible access pattern does not create intermediate parent objects like before. A secondary renaming method was added to take these changes into account.
Namespacing algorithm [EDIT 1]:
resource
exists as a key and the value is a mapresource
either doesn't contain anattributes
field, or contains anattributes
field of type mapscope
is either missing or a mapattributes
is either missing or a mapbody
is either missing or a mapbody
either doesn't contain atext
field, or contains atext
field of typeString
body
either doesn't contain astructured
field, or contains astructured
field that is not of typeString
attributes
mapresource
map with one entry of whichattributes
is the key and a new map as its valueattributes
map:attributes
,resource
,span_id
,body
,severity_text
andtrace_id
attributes
andresource
maps as top level fieldsspan.id
,log.level
) to OTel-compliant names: for each, look for a value first in the nested form and if not found look for a top level dotted field. The first value that is found is used for the renamed field@timestamp
,trace_id
,span_id
,severity_text
,body
,attributes
,resource
andscope
to the newattributes
mapattributes
attributes
toresource.attributes