- Notifications
You must be signed in to change notification settings - Fork 513
[teleport] Update event-groups ingest pipeline to manage cloud fields if already present #12851
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
1449981 to 69b1050 Compare 🚀 Benchmarks reportTo see the full report comment with |
| # This could fail if `cloud.account.id` field already exists in the doc. Forced to override the value to avoid throw an exception. | ||
| override: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cloud.instance.id could be set in three different processors using different source fields:
- teleport.audit.aws_host
- teleport.audit.db_gcp_instance_id
- teleport.audit.instance_id
| - rename: | ||
| field: teleport.audit.region | ||
| target_field: cloud.region | ||
| ignore_missing: true | ||
| ignore_failure: true | ||
| # This could fail if `cloud.region` field already exists in the doc. Forced to override the value to avoid throw an exception. | ||
| override: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cloud.region could be set in three different processors using different source fields:
- teleport.audit.aws_region
- teleport.audit.db_aws_region
- teleport.audit.region
| target_field: cloud.account.id | ||
| ignore_missing: true | ||
| # This could fail if `cloud.account.id` field already exists in the doc. Forced to override the value to avoid throw an exception. | ||
| override: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no other processor updating cloud.account.id.
| target_field: cloud.service.name | ||
| ignore_missing: true | ||
| # This could fail if `cloud.service.name` field already exists in the doc. Forced to override the value to avoid throw an exception. | ||
| override: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No other processor updating cloud.service.name.
| target_field: cloud.instance.id | ||
| ignore_missing: true | ||
| # This could fail if `cloud.instance.id` field already exists in the doc. Forced to override the value to avoid throw an exception. | ||
| override: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding override: true into these rename processors makes that the corresponding cloud field values will be set with the value of the latest rename processor executed.
Is that ok/expected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Direct link to the discussion in the original PR: #12624 (comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we enabled an approach like the solution that was implemented for the qualys case (user can specify what they are interested in), or a hard "use the data from the data source" approach, then those two fields cease to be an issue. The remaining fields,
cloud.regionandcloud.instance.idlook like they would then just fall out (depending on how teleport handles theteleport.audit.{region,instance_id}cases (are they disjoint with the explicitly named fields,teleport.audit.{aws,db_aws}andteleport.audit.{aws_host,db_gcp_instance_id}respectively. Unfortunately the test cases don't give any indication about this and I cannot find any reference for it in the teleport documentation; my feeling is that the more generic case should be overridden by the more specific case, but if an override occurs, then an error should be written in and the pipeline be continued. In the case that the we see apparently inconsistent fields being set (e.g.teleport.audit.aws_host,teleport.audit.db_gcp_instance_id) this would just be an error. I'm not sure what the behaviour ofteleport.audit.aws_regionv.teleport.audit.db_aws_regionshould be; are these mutually exclusive?
So, I don't have neither the context about teleport to answer that question nor to provide an alternative solution for the errors found in this ingest pipeline 😞
According to what you're suggesting (or at least I understand about it), do you mean to add an on_failure action in each rename processor modified here?
However, how could it be distinguished that the value to be override is the one set by filebeat or the one set by a previous processor?
Could I get some help here to move forward? 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another option, it could be done the opposite, add ignore_failure: true instead of overwrite. IIUC that would keep the values set by filebeat instead, losing the values from teleport data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The answer to the question is unfortunately a question. The issue is one of whether we know what the user wants from the data set. In the case of Qualys it was fairly clear that at least some users wanted the data source's values for these fields rather than the values that are added by the agent processor. ISTM that this is likely to be the case here too, but I don't know this for a fact.
go.mod Outdated
| sigs.k8s.io/yaml v1.4.0 // indirect | ||
| ) | ||
| | ||
| replace github.com/elastic/elastic-package => github.com/elastic/elastic-package v0.109.2-0.20250220090015-bbd8bcadb505 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be removed before merging.
The same for the changes in files under .buildkite folder.
These are required to test with the validation based on mappings.
| Pinging @elastic/security-service-integrations (Team:Security-Service Integrations) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Today, we know that no overrides are happening because that would result in an error.
Instead of adding override: true, which could change the behavior of cloud fields, how about adding a guard that checks if the target field exists to prevent the error (e.g., if: ctx.cloud?.x == null)?
I think the ideal solution is the one that was discussed, where the user gets to make the choice (i.e., what was added to Qualys). If we can't get this solution in time to unblock @mrodm's work on enhancing elastic-package, then I propose we replace override: true with an if, and then open an issue for the ideal solution.
side-note: We need to audit every system test that uses canned logs to find the ones that are missing assert.hit_count.
Ok, I'll replace those As a note, there would be some kind of change of behavior due to the fact that until now that pipeline was interrupted with an error if there ware |
Applied changes to use Created issue #12918 |
| if: ctx.cloud?.region == null | ||
| - remove: | ||
| field: teleport.audit.region | ||
| ignore_missing: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When using if conditions, as the source fields is not renamed, it must be added a new remove processor for each one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be in a single remove processor block, but I think this is less prone to future damage.
Co-authored-by: Dan Kortschak <dan.kortschak@elastic.co>
💚 Build Succeeded
History
cc @mrodm |
|
| @efd6 @andrewkroh could I get another review for this PR? 😊 Thanks in advance!! |
efd6 left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
| if: ctx.cloud?.region == null | ||
| - remove: | ||
| field: teleport.audit.region | ||
| ignore_missing: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be in a single remove processor block, but I think this is less prone to future damage.
| Package teleport - 1.2.2 containing this change is available at https://epr.elastic.co/package/teleport/1.2.2/ |
… if already present (#12851) Update event-groups ingest pipeline to manage cloud fields (like cloud.region, cloud.istance.id...) if they were already set by filebeat or other processors. This PR keeps the first value of those cloud.* fields by adding the required if conditions to the rename processors. It also adds the needed remove processors to remove the fields that were not renamed. These errors were detected while testing the validation based on mappings: https://buildkite.com/elastic/integrations/builds/22313 To ensure that these errors are caught by elastic-package it has been added the required system test setting to wait for the 270 documents defined in "_dev/deploy/docker/sample_logs/" folder. --------- Co-authored-by: Dan Kortschak <dan.kortschak@elastic.co>




Proposed commit message
Update event-groups ingest pipeline to manage cloud fields (like cloud.region, cloud.istance.id...) if they were already set by filebeat or other processors.
These errors were detected while testing the validation based on mappings: https://buildkite.com/elastic/integrations/builds/22313
But sometimes it was not detected, and it was due to
elastic-packagewas not waiting to get all the docs defined in the test folder to be ingested. After being ingested a few documents,elastic-packagestarted the validation of those docs found.This PR has two goals:
_dev/deploy/docker/sample_logs/.adding theoverridesetting into the rename processors to avoid throwing exceptions.IMPORTANT: The corresponding cloud fields will be set with the value of the last rename processor.ifsetting into the rename processors to avoid throwing exceptions if those fields were already present in the document.Errors found when
elastic-packagewaits for all 270 documents (buildkite link - pr link):In another test running just validation based on mappings, these were the fields undefined (buildkite link):
Related discussions from previous PR:
Checklist
changelog.ymlfile.Author's Checklist
elastic-packageenabled mappings https://buildkite.com/elastic/integrations/builds/22587.buildkitefolder andgo.mod/go.sumfiles.How to test this PR locally
Related issues