[elastic_security] Initial release of Elastic Security #14305
Add initial release of elastic_security package with a single data stream named alert. This also contains dashboards, ingest pipelines, tests, and readme.
Pinging @elastic/security-service-integrations (Team:Security-Service Integrations)
Event kind has been updated to alert from signal.
fields:
  - name: ancestry
    type: keyword
  - name: args
Aren't some of these fields already part of ECS? Can you identify them and remove?
Same for other top level fields that are already in ECS.
Some fields are being received as string values instead of the expected array format, so we need to explicitly define them in fields.yml.
This behavior has been observed consistently across events.
> Some fields are being received as string values instead of the expected array format
I don't really understand this justification. What do you mean by "expected array" format? Is that specifically for process.args? Many of these fields can simply change to using external: ecs without affecting the field type.
Fields that should use external: ecs
| field | ECS type |
|---|---|
| data_stream.type | constant_keyword |
| data_stream.dataset | constant_keyword |
| data_stream.namespace | constant_keyword |
| @timestamp | date |
| process.args_count | long |
| process.entity_id | keyword |
| process.entry_leader.args | keyword |
| process.entry_leader.args_count | long |
| process.entry_leader.entity_id | keyword |
| process.entry_leader.entry_meta.type | keyword |
| process.entry_leader.executable | keyword |
| process.entry_leader.group.name | keyword |
| process.entry_leader.interactive | boolean |
| process.entry_leader.name | keyword |
| process.entry_leader.parent.entity_id | keyword |
| process.entry_leader.parent.pid | long |
| process.entry_leader.pid | long |
| process.entry_leader.real_group.name | keyword |
| process.entry_leader.real_user.name | keyword |
| process.entry_leader.same_as_process | boolean |
| process.entry_leader.user.name | keyword |
| process.entry_leader.working_directory | keyword |
| process.executable | keyword |
| process.group.name | keyword |
| process.group_leader.args | keyword |
| process.group_leader.args_count | long |
| process.group_leader.entity_id | keyword |
| process.group_leader.executable | keyword |
| process.group_leader.group.name | keyword |
| process.group_leader.interactive | boolean |
| process.group_leader.name | keyword |
| process.group_leader.pid | long |
| process.group_leader.real_group.name | keyword |
| process.group_leader.real_user.name | keyword |
| process.group_leader.same_as_process | boolean |
| process.group_leader.supplemental_groups.name | keyword |
| process.group_leader.user.name | keyword |
| process.group_leader.working_directory | keyword |
| process.hash.md5 | keyword |
| process.hash.sha1 | keyword |
| process.hash.sha256 | keyword |
| process.interactive | boolean |
| process.name | keyword |
| process.parent.args | keyword |
| process.parent.args_count | long |
| process.parent.entity_id | keyword |
| process.parent.executable | keyword |
| process.parent.group.name | keyword |
| process.parent.interactive | boolean |
| process.parent.name | keyword |
| process.parent.pid | long |
| process.parent.real_group.name | keyword |
| process.parent.real_user.name | keyword |
| process.parent.supplemental_groups.name | keyword |
| process.parent.user.name | keyword |
| process.parent.working_directory | keyword |
| process.pid | long |
| process.previous.args | keyword |
| process.previous.args_count | long |
| process.previous.executable | keyword |
| process.real_group.name | keyword |
| process.real_user.name | keyword |
| process.session_leader.args | keyword |
| process.session_leader.args_count | long |
| process.session_leader.entity_id | keyword |
| process.session_leader.executable | keyword |
| process.session_leader.group.name | keyword |
| process.session_leader.interactive | boolean |
| process.session_leader.name | keyword |
| process.session_leader.pid | long |
| process.session_leader.real_group.name | keyword |
| process.session_leader.real_user.name | keyword |
| process.session_leader.same_as_process | boolean |
| process.session_leader.supplemental_groups.name | keyword |
| process.session_leader.user.name | keyword |
| process.session_leader.working_directory | keyword |
| process.supplemental_groups.name | keyword |
| process.user.name | keyword |
| process.working_directory | keyword |
| threat.tactic.id | keyword |
| threat.tactic.reference | keyword |
| threat.tactic.name | keyword |
| threat.technique.id | keyword |
| threat.technique.name | keyword |
| threat.technique.reference | keyword |
| threat.technique.subtechnique.id | keyword |
| threat.technique.subtechnique.name | keyword |
| threat.technique.subtechnique.reference | keyword |
This can be fixed by running
go run github.com/andrewkroh/fydler@main -a useecs -fix packages/elastic_security/**/fields/*.yml
However, the more concerning part is the set of fields that are declared in conflict with ECS. I think these need to be fixed to use external: ecs, which will change their type. Most of these changes will widen the data type, except for the keyword to date changes. Here's a summary:
| field | type | ECS type |
|---|---|---|
| process.command_line | keyword | wildcard |
| process.entry_leader.group.id | long | keyword |
| process.entry_leader.parent.start | keyword | date |
| process.entry_leader.real_group.id | long | keyword |
| process.entry_leader.real_user.id | long | keyword |
| process.entry_leader.start | keyword | date |
| process.entry_leader.user.id | long | keyword |
| process.group.id | long | keyword |
| process.group_leader.group.id | long | keyword |
| process.group_leader.real_group.id | long | keyword |
| process.group_leader.real_user.id | long | keyword |
| process.group_leader.start | keyword | date |
| process.group_leader.supplemental_groups.id | long | keyword |
| process.group_leader.user.id | long | keyword |
| process.parent.command_line | keyword | wildcard |
| process.parent.group.id | long | keyword |
| process.parent.real_group.id | long | keyword |
| process.parent.real_user.id | long | keyword |
| process.parent.start | keyword | date |
| process.parent.supplemental_groups.id | long | keyword |
| process.parent.user.id | long | keyword |
| process.real_group.id | long | keyword |
| process.real_user.id | long | keyword |
| process.session_leader.group.id | long | keyword |
| process.session_leader.real_group.id | long | keyword |
| process.session_leader.real_user.id | long | keyword |
| process.session_leader.start | keyword | date |
| process.session_leader.supplemental_groups.id | long | keyword |
| process.session_leader.user.id | long | keyword |
| process.start | keyword | date |
| process.supplemental_groups.id | long | keyword |
| process.user.id | long | keyword |
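As a rough sketch of what the suggested change looks like in fields.yml, a conflicting explicit type is replaced by a reference to the ECS definition. The two field names are taken from the table above; the flat dotted-name layout is an assumption (the package may nest them under group entries), and the sketch assumes the package already declares an ECS dependency in its build configuration.

```yaml
# Before (conflicts with ECS, which defines these as keyword and date):
# - name: process.group.id
#   type: long
# - name: process.start
#   type: keyword

# After: take type and description from the ECS definition.
- name: process.group.id
  external: ecs
- name: process.start
  external: ecs
```

With external: ecs the field type follows ECS, so process.group.id is mapped as keyword and process.start as date.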
@andrewkroh
We're seeing two types of issues or behaviors here:
- Data type mismatches: Some ECS fields are appearing with a different data type than expected. For example, process.group.id is coming in as an integer in the raw logs, whereas ECS expects it to be a string.
- Structure mismatches: Some fields, like threat.techniques, are coming through as arrays of objects in the raw logs, while ECS expects them to be explicitly defined as group or nested types.

Example here:

{ "threat": [ { "framework": "MITRE ATT&CK", "technique": [ { "reference": "https://attack.mitre.org/techniques/T1059/", "name": "Command and Scripting Interpreter", "subtechnique": [ { "reference": "https://attack.mitre.org/techniques/T1059/004/", "name": "Unix Shell", "id": "T1059.004" }, { "reference": "https://attack.mitre.org/techniques/T1059/006/", "name": "Python", "id": "T1059.006" } ], "id": "T1059" } ], "tactic": { "reference": "https://attack.mitre.org/tactics/TA0002/", "name": "Execution", "id": "TA0002" } }, { "framework": "MITRE ATT&CK", "technique": [ { "reference": "https://attack.mitre.org/techniques/T1132/", "name": "Data Encoding", "subtechnique": [ { "reference": "https://attack.mitre.org/techniques/T1132/001/", "name": "Standard Encoding", "id": "T1132.001" } ], "id": "T1132" } ] } ] }

For the first issue, even if we reference the external: ecs definition, we'll still encounter data type mismatch errors. To resolve this, we can add a script in the ingest pipeline to convert such fields (e.g., integers to strings) and then remove these ECS definitions from the fields.yml as definitions will be handled by the dynamic ECS imports.
For the second issue, the structure itself is incompatible. One possible solution is that we'll need to handle these cases in the pipeline as well—likely by flattening or reformatting the data to match the ECS schema.
Let me know your thoughts on this.
> Data type mismatches: Some ECS fields are appearing with a different data type than expected. For example, process.group.id is coming in as an integer in the raw logs, whereas ECS expects it to be a string.
The Elasticsearch data type is the primary concern. The integration reads the event structure from _source, which might not reflect how the data was indexed in the source cluster.
For instance, it's common for fields to be numbers in the JSON _source and stored as keyword types. However, in elastic-package, this causes a validation issue because we prefer _source to match the ES data type (e.g., JSON string for a keyword). To resolve this, you can either apply a convert(type: string) processor or configure elastic-package to ignore numeric fields using numeric_keyword_fields.
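The two options could look roughly like the following. This is a sketch, not the PR's actual change: process.group.id is used only as a representative field, and the file locations named in the comments (the data stream's default.yml ingest pipeline and an elastic-package test config file) are assumptions.

```yaml
# Option 1 (ingest pipeline, e.g. default.yml):
# convert the numeric ID to a string so _source matches the keyword type.
processors:
  - convert:
      field: process.group.id
      type: string
      ignore_missing: true
---
# Option 2 (elastic-package test configuration):
# keep _source as-is and tell validation that these keyword fields
# may carry JSON numbers.
numeric_keyword_fields:
  - process.group.id
```

Option 1 changes the stored document, while Option 2 only relaxes test validation and leaves the original numeric values in _source.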
> then remove these ECS definitions from the fields.yml as definitions will be handled by the dynamic ECS imports.
Do not remove the static ECS definitions. Keep them. They are generally stronger (because they don't rely on match_mapping_type) and provide documentation.
> Structure mismatches: Some fields, like threat.techniques, are coming through as arrays of objects in the raw logs, while ECS expects them to be explicitly defined as group or nested types.
Again, I believe this relates to how the _source is structured versus how the data was actually indexed and how it will be indexed in the target cluster.
If you send the data as-is to an index that contains mappings for the ECS threat fields, it should be mapped correctly. This is because Elasticsearch automatically flattens arrays of objects. Did you encounter any errors while trying to use the ECS definitions with this JSON structure?
Thanks for the suggestion. @andrewkroh
Will ignore the numeric fields using numeric_keyword_fields and use external: ecs to resolve the data type mismatch issue.
Regarding the second issue on structure mismatch, using the ECS definitions with the provided JSON structure results in an error from elastic-package, as shown in the attached message.
ECS expects the threat.technique fields to be either group or nested, but they are coming in as an array of objects.

Thank you for sharing the error. That elastic-package validation seems counterproductive given that it would force the pipeline to implement something that Elasticsearch does automatically. I don't see any way to disable the check. So I think a comment in the file explaining why the declaration for the threat.* fields exists, and why they are not using external: ecs, is in order.
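Such a comment might read something like the sketch below. The wording of the comment and of the description values is illustrative only and is not taken from the PR; the field names and keyword types come from the tables earlier in this thread.

```yaml
# The threat.* fields are declared explicitly instead of using `external: ecs`.
# The source documents carry threat, threat.technique and
# threat.technique.subtechnique as arrays of objects, and elastic-package
# validation reports an error for that shape when the ECS definitions are used.
# Elasticsearch flattens arrays of objects at index time, so the resulting
# mapping is the same as the ECS one.
- name: threat.technique.id
  type: keyword
  description: ID of the technique used by this threat (illustrative wording).
- name: threat.technique.subtechnique.id
  type: keyword
  description: ID of the subtechnique used by this threat (illustrative wording).
```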
> If you send the data as-is to an index that contains mappings for the ECS threat fields, it should be mapped correctly. This is because Elasticsearch automatically flattens arrays of objects.
@andrewkroh, since this automatic flattening results in losing relationships/associations between fields, is it better to make them nested instead and keep the associations for threat fields?
@mohitjha-elastic, if you don't add external: ecs for any ECS fields, please manually add the field's description:.
> since this automatic flattening results in losing relationships/associations between fields, is it better to make them nested instead and keep the associations for threat fields?
I would defer to ECS on this. ECS does not indicate to use nested on the threat, only on the threat.enrichments. So it must be that the associations are not essential (if they are then we need to change ECS).
Thanks @andrewkroh and @kcreddy!!
Here is the Issue and the PR.
1. Update descriptions of config parameters.
2. Add saved search in dashboard.
3. Update query parameter in data collection, moved it to the request body.
4. Preserve event.original value from message field.
5. Update readme.
@@ -0,0 +1,1122 @@
{
"@timestamp": "2060-06-09T13:56:03.205Z",
The data is fetched starting from that interval, and pagination happens based on the @timestamp of the incoming data. Therefore, the starting time should be in the future rather than the past.
If you set @timestamp inside the config.yml to reasonable past dates (2022, 2023, etc.), you will still achieve the same result, because your cursor.last_timestamp is based on those values.
integrations/packages/elastic_security/data_stream/alert/agent/stream/cel.yml.hbs
Line 81 in 602ae72
optional.of(body.hits.hits.map(e, timestamp(e._source['@timestamp'])).max())
1. Update readme.
2. Shorten system test data for documentation.
3. Add some safety checks in cel code.
4. Replace pipeline script with remove processor.
💚 Build Succeeded
efd6 left a comment
LGTM but please wait for @kcreddy.
kcreddy left a comment
LGTM for the review comments.
@mohitjha-elastic, can you also confirm the issues discussed via DMs, namely mismatched event.severity, incorrect @timestamp, and duplicate event.id are all fixed now?
@kcreddy Sorry I left that issue unconcluded. I investigated and confirmed that the logs at both the source and destination Elasticsearch instances are identical — same |
kcreddy left a comment
Thanks!
Package elastic_security - 0.1.0 containing this change is available at https://epr.elastic.co/package/elastic_security/0.1.0/





