- Notifications
You must be signed in to change notification settings - Fork 119
Fix internal errors from version queries: Part 1 #3809
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix internal errors from version queries: Part 1 #3809
Conversation
This begins the work necessary to fix internal errors related to `VersionValue` queries. The ultimate goal is to remove `VersionValue`, but for now, what this does is it modifies the way the various operators that sit on top of record fetches and has them copy out the version into a field, called `__ROW_VERSION`. Once that's done, then in a future version (say, 4.9), we'll be able to modify the `PlanGenerator` so that it references that field. Note that we'll need to wait for that rather than do it immediately for backwards compatibility reasons. If we took the final version right away, then any plan that wants the version when sent to an older version that doesn't do this copying will be unable to process that `FieldValue`. This is in furtherance of FoundationDB#3796.
...rc/main/java/com/apple/foundationdb/record/metadata/expressions/RecordTypeKeyExpression.java Show resolved Hide resolved
.../java/com/apple/foundationdb/record/query/plan/cascades/WindowedIndexScanMatchCandidate.java Outdated Show resolved Hide resolved
...-core/src/main/java/com/apple/foundationdb/relational/recordlayer/query/LogicalOperator.java Show resolved Hide resolved
| return ImmutableSet.of(p.getResultType()); | ||
| final Type resultType = p.getResultType(); | ||
| if (!resultType.isPrimitive() && !resultType.isUuid()) { | ||
| return ImmutableSet.of(resultType); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did this really return a type even if the type is not a record? I guess something using the set that is returned here would then filter it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow. This returns the type only if it's a record or an enum. Primitive types (and UUIDs) get filtered out. This is necessary so that the type repository has the type when it goes to copy it. It also fixed #3734
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method only really needs to return types that are Type.Record. Before your change, it seems it added primitive types as well which made me think about what happens here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah. I believe that before, because it was basing its choice on whether the type was a CreatesDynamicTypesValue, which was mostly RecordConstructorValue and AbstractArrayConstructorValue. So, it would have returned a primitive if that was somehow the return type of one of those (like, say, PromoteValue). That no longer works as more things require dynamic types. No one is actually using that marker interface any more, so it could be removed, I suppose.
...ain/java/com/apple/foundationdb/record/query/plan/cascades/AggregateIndexMatchCandidate.java Show resolved Hide resolved
...core/src/main/java/com/apple/foundationdb/record/query/plan/cascades/IndexExpansionInfo.java Outdated Show resolved Hide resolved
.../src/main/java/com/apple/foundationdb/record/query/plan/plans/RecordQueryLoadByKeysPlan.java Show resolved Hide resolved
| | ||
| @Nonnull | ||
| private static Map<String, Descriptors.FieldDescriptor> getFieldDescriptorMap(@Nonnull final Stream<RecordType> recordTypeStream) { | ||
| public Type.Record getPlannerType(@Nonnull Collection<String> recordTypeNames) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we cache this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice (or at least, it would be nice to cache the single-type variant returned by getPlannerType(String)). My initial version of this constructed these structures during RecordMetaData.build and then just retrieves it from a Map. However, the descriptor to Type.Record logic cannot handle cyclic type relationships (e.g., a type X with a field of type Y which has a field of type X), and there are (it turns out) a bunch of tests in our code base with that kind of structure.
We could make this a memoized thing, and then we'd avoid re-creating the same object multiple times. We weren't doing that before, though, in the old code path, so this is at least treading water.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, the descriptor to Type.Record logic cannot handle cyclic type relationships (e.g., a type X with a field of type Y which has a field of type X), and there are (it turns out) a bunch of tests in our code base with that kind of structure.
How is that possible? Type is immutable so how can you create those cyclic types. Can you point to a test case that does this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In some sense, it's the immutability of Type that is the problem. We have test cases like UnnestedRecordTypeTest.unnestDoubleNestedMapType that use test_records_double_nested.proto as the basis of their meta-data definition: https://github.com/FoundationDB/fdb-record-layer/blob/main/fdb-record-layer-core/src/test/proto/test_records_double_nested.proto
Note that the types in that file (like, say MiddleRecord, which has a field of type MiddleRecord) have a cyclic graph. I don't think you can construct a Type.Record that models this correctly, and if you try to, you get a StackOverflowError. FWIW, I think you would always have gotten a stack overflow error if you tried to use cascades to query this type, and trying to do that eagerly just overturned the bug
…CV construction was taking place
…ecord instead of a base Type
…ugh Cascades plans are not yet good
…t coverage of those plan execute methods
1. The normalize fields method was resetting any field numbers if they were already set. This can cause `deepCopy` problems as the message structures may not align if the original types had out of order field numbers 1. Update could get weird. It expects the old and new messages to have the same format, which means that either they both needed to reference pseudo-fields or they both needed to not. However, the way it was structured, we could wind up placing two types with the same name (one with and one without the pseudo-fields) into the type repository. This adjusts this to not re-use the type, but we may have to think about what that means for the result set meta-data
9824821 to 50f02fc Compare 📊 Metrics Diff Analysis ReportSummary
ℹ️ About this analysisThis automated analysis compares query planner metrics between the base branch and this PR. It categorizes changes into:
The last category in particular may indicate planner regressions that should be investigated. New QueriesCount of new queries by file:
Dropped QueriesThe following queries with metrics were removed:
The reviewer should double check that these queries were removed intentionally to avoid a loss of coverage. Plan and Metrics ChangedThese queries experienced both plan and metrics changes. This generally indicates that there was some planner change Total: 21 queries Statistical Summary (Plan and Metrics Changed)
There were no queries with significant regressions detected. Minor Changes (Plan and Metrics Changed)In addition, there were 21 queries with minor changes. Only Metrics ChangedThese queries experienced only metrics changes without any plan changes. If these metrics have substantially changed, Total: 16 queries Statistical Summary (Only Metrics Changed)
Significant Regressions (Only Metrics Changed)There was 1 outlier detected. Outlier queries have a significant regression in at least one field. Statistically, this represents either an increase of more than two standard deviations above the mean or a large absolute increase (e.g., 100).
Minor Changes (Only Metrics Changed)In addition, there were 15 queries with minor changes. |
| Adding |
This begins the work necessary to fix internal errors related to
VersionValuequeries. The ultimate goal is to removeVersionValue, but for now, what this does is it modifies the way the various operators that sit on top of record fetches and has them copy out the version into a field, called__ROW_VERSION. Once that's done, then in a future version (say, 4.9), we'll be able to modify thePlanGeneratorso that it references that field. Note that we'll need to wait for that rather than do it immediately for backwards compatibility reasons. If we took the final version right away, then any plan that wants the version when sent to an older version that doesn't do this copying will be unable to process thatFieldValue.This is in furtherance of #3796.
This also fixes #3734. This is because the type repository updates that are done to make the new copied-to types present in the type repository also put in the enum fields that were previously missing (and causing #3734).