- Notifications
You must be signed in to change notification settings - Fork 25.5k
Refactor IngestDocument's resolve method #125051
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor IngestDocument's resolve method #125051
Conversation
There's no need to construct and return a ResolveResult in order to do the looping externally. Instead we can do the looping within resolve and only construct and return a ResolveResult on the last iteration of the loop.
Although an IngestCtxMap is a map, having a separate slot in the if/else chain for it gives the JVM the opportunity to optimize the IngestCtxMap and almost-certainly-a-HashMap cases separately, rather than needing to treat both of them the same.
Pinging @elastic/es-data-management (Team:Data Management) |
This is tangentially related to #123891 -- more efficient |
ResolveResult result = resolve(fieldPath.pathElements, fieldPath.pathElements.length, path, context); | ||
if (result.wasSuccessful) { | ||
return cast(path, result.resolvedObject, clazz); | ||
} else if (ignoreMissing && hasField(path) == false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think hasField(path) == false
is no longer needed. I took a look at the commit that introduced it 6796464f
, and the only reason I can think that it would have failed this check would be if the path resolved, but the type of the resolved object was wrong. I believe that now, to get to the else-if condition, the path cannot be in the object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, actually -- I think the code was wrong before this PR and maybe it's just a little more obvious now with this change. #20982 introduced it and would have been 'right', but #84659 would have changed those semantics, I think.
I'm going to see if I can magic up a test that would have been broken by this change just to confirm my thinking, but either way I'll get rid of the hasField
check here.
Good eyes, Parker! Thank you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's actually going to be a nice tiny little speedup, too, btw -- there are places in the benchmarks where that hasField(path)
check was hot enough to show up in a profile.
💚 Backport successful
|
![]() Here's a screenshot from the nightly benchmarks -- there's a big drop in |
Rather than looping externally and calling the method multiple times, we can loop internally and only call the method once. This is primarily intended as a code cleanup and clearness change, so I went with
>refactoring
.As a small additional optimization, I split the
instanceof Map
checks to first run aninstanceof IngestCtxMap
check -- these maps areIngestCtxMap
at the top level of a path traversal, and then almost exclusivelyHashMap
s at subsequent levels. By having separate paths, the JVM has the opportunity to learn that those concrete types are hot in each place, rather than them both being mixed up with each other (or at least that's my understanding).