Skip to content

Conversation

@mrodm
Copy link
Contributor

@mrodm mrodm commented Mar 5, 2025

This PR adds the same exception for leaf of objects that was set in place for validation based on fields found in documents:

if skipLeafOfObject(root, name, v.specVersion, v.Schema) {

These validations are based on the spec version defined in the package.

Found issue in these builds for gcp and network_traffic where those missing definitions should be ignored since those packages define spec version 3.0.0

https://buildkite.com/elastic/integrations/builds/23010

Buildkite build testing this change with all packages: https://buildkite.com/elastic/integrations/builds/23032

Buildkite build testing with stack 9.1.0-SNAPSHOT with this change:
https://buildkite.com/elastic/integrations/builds/23034

How to test locally this PR

cd path/to/integrations elastic-package stack up -v -d --version 9.1.0-SNAPSHOT cd packages/gcp elastic-package test system -v --data-streams firewall cd ../packages/network_traffic elastic-package test system -v --data-streams dns elastic-package stack down -v
@mrodm mrodm self-assigned this Mar 5, 2025
@mrodm
Copy link
Contributor Author

mrodm commented Mar 5, 2025

test integrations

@elastic-vault-github-plugin-prod

Created or updated PR in integrations repository to test this version. Check elastic/integrations#12965

@mrodm mrodm marked this pull request as ready for review March 5, 2025 11:34
@mrodm mrodm requested a review from a team March 5, 2025 11:34
all = append(all, fields...)
default:
if skipLeafOfObject(root, name, v.specVersion, v.Schema) {
logger.Warnf("Skip validating leaf of object (key %q spec %q)", key, v.specVersion)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove these kind of messages? Do you think they are adding noise to the output ?

And, should these other warnings be removed too?

logger.Warnf("Found exception field, skip its validation (not present in preview): %q", fieldPath)

logger.Debugf("Skip field which value is an empty object: %q", fieldPath)

logger.Warnf("Found exception field, skip its validation: %q", path)

logger.Warnf("Skip validating field of type array with spec < 2.0.0 (key %q type %q spec %q)", key, definition.Type, v.specVersion)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is probably too verbose, maybe even at the debug level.

If we want to have this at the debug level, we could print the full list after collecting it instead of once per element.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a new TRACE log level here fc6c793 if more than -v is set in the command line.

This log level should not be enabled in CI.

Probably, this would also help in other scenarios.

WDYT ? @jsoriano

@mrodm mrodm requested a review from jsoriano March 5, 2025 13:06
Copy link
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice to add this trace log level.

all = append(all, fields...)
default:
if skipLeafOfObject(root, name, v.specVersion, v.Schema) {
logger.Tracef("Skip validating leaf of object (key %q spec %q)", key, v.specVersion)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we are keeping the list of fields we could still log the full list after the loop instead of logging all of them. Then it could still be at the debug level.

logger.Debugf("Skip validating leaves of object %q (spec %q): %s", root, v.specVersion, strings.Join(all, ",")) 
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll add that.

Testing it in gcp package, this would be the output:

2025/03/05 14:51:56 DEBUG Performing validation based on mappings 2025/03/05 14:51:56 DEBUG Skip validating leafs of object "" (spec "3.0.0"): gcp.source.vpc.vpc_name,source.domain,agent.type,log.logger,network.direction,data_stream.namespace,gcp.destination.instance.zone,gcp.source.vpc.subnetwork_name,agent.name,network.type,cloud.project.id,event.type,rule.name,network.name.text,rule.name.text,gcp.source.vpc.project_id,source.port,destination.address,event.kind,event.module,destination.ip,agent.ephemeral_id,event.id,event.category,gcp.destination.instance.project_id,gcp.source.instance.project_id,network.community_id,event.created,network.iana_number,gcp.destination.vpc.subnetwork_name,gcp.destination.vpc.project_id,cloud.availability_zone,network.name,gcp.firewall.rule_details.direction,gcp.firewall.rule_details.ip_port_info,gcp.firewall.rule_details.priority,gcp.source.instance.region,gcp.source.instance.zone,destination.port,cloud.provider,agent.version,data_stream.dataset,data_stream.type,source.address,gcp.firewall.rule_details.source_range,event.agent_id_status,gcp.destination.vpc.vpc_name,event.action,event.ingested,event.dataset,agent.name.text,destination.domain,gcp.destination.instance.region,source.ip,agent.id,ecs.version,gcp.firewall.rule_details.target_tag,gcp.firewall.rule_details.action,network.transport,cloud.region,related.ip 2025/03/05 14:51:56 DEBUG Skip validating leafs of object "" (spec "3.0.0"): agent.type,related.ip,gcp.destination.vpc.subnetwork_name,gcp.source.vpc.subnetwork_name,network.name.text,gcp.destination.vpc.vpc_name,event.dataset,gcp.firewall.rule_details.target_tag,source.address,event.ingested,cloud.region,agent.version,event.agent_id_status,source.port,destination.port,destination.ip,agent.ephemeral_id,rule.name,log.logger,gcp.source.instance.project_id,source.domain,network.type,gcp.destination.instance.zone,gcp.destination.instance.project_id,ecs.version,event.created,cloud.project.id,event.kind,network.community_id,cloud.provider,rule.name.text,network.name,event.action,agent.name,data_stream.type,cloud.availability_zone,destination.domain,gcp.firewall.rule_details.ip_port_info,network.iana_number,network.transport,source.ip,gcp.source.vpc.vpc_name,gcp.firewall.rule_details.action,gcp.firewall.rule_details.direction,event.type,network.direction,gcp.source.vpc.project_id,data_stream.dataset,gcp.source.instance.zone,event.module,data_stream.namespace,gcp.destination.vpc.project_id,agent.id,gcp.firewall.rule_details.priority,destination.address,agent.name.text,gcp.firewall.rule_details.source_range,gcp.destination.instance.region,gcp.source.instance.region,event.id,event.category 2025/03/05 14:51:56 DEBUG Skip validating leafs of object "" (spec "3.0.0"): cloud.availability_zone,source.domain,destination.as.number,source.ip,gcp.firewall.rule_details.priority,source.address,agent.ephemeral_id,event.category,network.name.text,destination.geo.country_name,data_stream.namespace,destination.geo.country_iso_code,agent.name.text,cloud.region,network.iana_number,event.id,gcp.source.instance.region,network.transport,event.dataset,gcp.source.instance.zone,agent.type,network.type,destination.ip,data_stream.dataset,gcp.source.instance.project_id,gcp.firewall.rule_details.target_tag,agent.id,event.module,destination.geo.continent_name,gcp.firewall.rule_details.action,data_stream.type,gcp.firewall.rule_details.ip_port_info,destination.port,log.logger,gcp.firewall.rule_details.destination_range,event.type,ecs.version,gcp.source.vpc.vpc_name,gcp.source.vpc.subnetwork_name,destination.address,event.agent_id_status,destination.geo.location,network.direction,event.action,event.ingested,rule.name,cloud.provider,event.created,agent.version,network.name,agent.name,gcp.firewall.rule_details.direction,cloud.project.id,network.community_id,event.kind,source.port,related.ip,rule.name.text,gcp.source.vpc.project_id 2025/03/05 14:51:56 DEBUG Get Mappings from data stream (logs-gcp.firewall-57633) 

Probably with packages with more documents ingested (there would be one line per document) and more tests this could be too much.
I think it would be better to just set it as TRACE.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just realized that the all variable contains more fields that just the ones found by skipLeafOfObject.

So I'd keep the same message as Trace, but add a new Debug message with all the exception fields found (the ones that are going to be skipped):

2025/03/05 15:39:25 DEBUG Fields to be skipped validation: agent.ephemeral_id,agent.id,agent.name,agent.name.text,agent.type,agent.version,cloud.availability_zone,cloud.project.id,cloud.provider,cloud.region,data_stream.dataset,data_stream.namespace,data_stream.type,destination.address,destination.domain,destination.ip,destination.port,ecs.version,event.action,event.agent_id_status,event.category,event.created,event.dataset,event.id,event.ingested,event.kind,event.module,event.type,gcp.destination.instance.project_id,gcp.destination.instance.region,gcp.destination.instance.zone,gcp.destination.vpc.project_id,gcp.destination.vpc.subnetwork_name,gcp.destination.vpc.vpc_name,gcp.firewall.rule_details.action,gcp.firewall.rule_details.direction,gcp.firewall.rule_details.ip_port_info,gcp.firewall.rule_details.priority,gcp.firewall.rule_details.source_range,gcp.firewall.rule_details.target_tag,gcp.source.instance.project_id,gcp.source.instance.region,gcp.source.instance.zone,gcp.source.vpc.project_id,gcp.source.vpc.subnetwork_name,gcp.source.vpc.vpc_name,log.logger,network.community_id,network.direction,network.iana_number,network.name,network.name.text,network.transport,network.type,related.ip,rule.name,rule.name.text,source.address,source.domain,source.ip,source.port,destination.as.number,destination.geo.continent_name,destination.geo.country_iso_code,destination.geo.country_name,destination.geo.location,gcp.firewall.rule_details.destination_range 2025/03/05 15:39:25 DEBUG Get Mappings from data stream (logs-gcp.firewall-25845) 
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that there could be quite a few elements in that list.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you are right, too many things there. So as you prefer.

Copy link
Contributor Author

@mrodm mrodm Mar 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'll set it then as Trace that message with the full list of exception fields. 812f88d

}
}

logger.Debugf("Fields to be skipped validation: %s", strings.Join(allFields, ","))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New message added to show all the exception fields found in all documents (as debug level).

Not sure if this could be even too verbose.

@elasticmachine
Copy link
Collaborator

elasticmachine commented Mar 5, 2025

💛 Build succeeded, but was flaky

Failed CI Steps

History

cc @mrodm

@mrodm mrodm merged commit 183024e into elastic:main Mar 5, 2025
3 checks passed
@mrodm mrodm deleted the add-exception-leaf-object-mappings branch April 11, 2025 10:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

3 participants