-
Couldn't load subscription status.
- Fork 25.6k
Reuse FieldPermissionsCache in Role parsing #94931
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
When parsing role descriptors, we ensure that the FieldPermissions (`"field_security":{ "grant":[ ... ], "except":[ ... ] }`) are valid - that is that any patterns compile correctly, and the "except" is a subset of the "grant". However, the previous implementation would not use the FieldPermissionsCache for this, so it would compile (union, intersect & minimize) automatons every time a role was parsed. This was particularly an issue when parsing roles (from the security index) in the GET /_security/role/ endpoint. If there were a large number of roles with field level security the automaton parsing could have significant impact on the performance of this API. | Pinging @elastic/es-security (Team:Security) |
| Hi @tvernum, I've created a changelog YAML for you. |
| if (cache == null) { | ||
| return new FieldPermissions(definition); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't allow this cache to be disabled. So this should never be null. We should either assert that or throw error. Also this constructor is currently only used in tests because that's where cache may not exist. With this new method, it is possible to replace the constructor usages with init and this method and drop this constructor entirely and I think that would be a good thing.
| final AuthorizationService authzService = new AuthorizationService( | ||
| settings, | ||
| allRolesStore, | ||
| fieldPermissionsCache, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to make this change? The cache used by RBACEngine is techinically cross different index groups and the combination depends on the requested indices patterns, e.g. if we have
"indices": [ { "name": ["logs-*"], "privileges": ["read"], "field_security": {"grant": ["xxx"] } }, { "name": ["*-staging"], "privileges": ["read"], "field_security": {"grant": ["yyy"] } }, ] If the requested index pattern is logs-nginx-staging, both groups will match and the cache key will be
new FieldPermissionsDefinition( Set.of( new FieldGrantExcludeGroup(["xxx"], null), new FieldGrantExcludeGroup(["yyy"], null) ) ) While the cache used in CompositeRolesStore is most likely keyed by a single index group. With the above example, the two groups will be cached separately. If we combine the two caches, it is possible that more entries would be competing for cache usages.
Along the same line of reasoning, the cache used by RoleDescriptor is purely for a single index definition. Strictly speaking, it can also be different from the cache usage by CompositeRolesStore which can combine index groups if they have exactly the same names definition.
In summary, the cache is keyed slightly different with the 3 use cases:
- RoleDescriptor - keyed strictly by a single index group's FLS definition
- CompositeRolesStore - keyed by FLS combination of multiple index groups that share the same
namesdefinition - RBACEngine - keyed by FLS combination of mulitple index groups that have
namesmatching the requested index.
That said, I do think having a single cache for them is reasonable because we don't really want to differentiate clearly between the different type of cache keys. My only concern is whether combining them could add extra noticeable pressure on this single cache.
| final FieldPermissionsCache cache = new FieldPermissionsCache(settings); | ||
| INSTANCE.set(cache); | ||
| return cache; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels a bit odd to have both a staticly shared instance and a public constructor.
| @ywangd I'll respond to your comments collectively because I think they're centred around 2 issues:
On point 1 we have the issues of:
In both of those cases the reason is that a large number of tests assume the old behaviour of either (1) constructing a cache inside the test or (2) being able to parse a role without configuring a cache. To be honest, I hate the static cache, but Perhaps we need to chat about which big change we'd prefer (tighten up the static cache, or remove the need for a static cache) before we go much further. On the question of sharing caches, my thoughts are:
Aside: Subset Checking: I find the existing subset checking to be a bit weird. The design is that you can have but not However, because it's a subset check rather than an intersection check, you also can't say It is what it is, and I don't want to argue to change it, but it means that I don't think we'd want to push this exact check into |
| For the cache sharing, I think you have some good points on having a single cache. It is more intuitive (a single setting configures a single cache). The existing 2-cache structure also has memory duplication issue. So the change here actually fixes it with a different tradeoff. I think it is reasonable to assume that the memory duplication issue (single index section matching the requested index) is more likely to happen than the issue with cache sharing. Based on these, I am on board with the approach of a single shared cache. As for how much changes we want for this PR, I do think we want to keep it as small as possible for the sake of 7.17 backport. But I also find it is rather non-ideal that it is possible to call private static final AtomicReference<Cache<FieldPermissionsDefinition, FieldPermissions>> CACHE_REF = new AtomicReference<>(); public FieldPermissionsCache(Settings settings) { CACHE_REF.updateAndGet(c -> { if (c == null) { return CacheBuilder.<FieldPermissionsDefinition, FieldPermissions>builder() .setMaximumWeight(CACHE_SIZE_SETTING.get(settings)) .weigher((key, fieldPermissions) -> fieldPermissions.ramBytesUsed()) .build(); } else { return c; } }); } public FieldPermissions getFieldPermissions(FieldPermissionsDefinition fieldPermissionsDefinition) { return CACHE_REF.get().computeIfAbsent(...); }It helps getting rid of the static |
| @ywangd I've updated this based on our conversation. I think it ended up being a much simpler change, though we acknowledge there's some debt to refactor later. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I like how small the changeset is!
| } | ||
| | ||
| public void testParseIndicesPrivilegesFailsWhenExceptFieldsAreNotSubsetOfGrantedFields() { | ||
| resetFieldPermssionsCache(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a question: Is this reset necessary or nice to have just in case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's no longer necessary. It was needed in an earlier version where sometimes tests ran without a cache, because the behaviour with/without the cache was subtly different.
Now it's just there because it makes sense to have a known clean base for the test (and not have it be at risk of flakiness due to what's existing in the cache)
When parsing role descriptors, we ensure that the FieldPermissions (`"field_security":{ "grant":[ ... ], "except":[ ... ] }`) are valid - that is that any patterns compile correctly, and the "except" is a subset of the "grant". However, the previous implementation would not use the FieldPermissionsCache for this, so it would compile (union, intersect & minimize) automatons every time a role was parsed. This was particularly an issue when parsing roles (from the security index) in the GET /_security/role/ endpoint. If there were a large number of roles with field level security the automaton parsing could have significant impact on the performance of this API. When parsing role descriptors, we ensure that the FieldPermissions (`"field_security":{ "grant":[ ... ], "except":[ ... ] }`) are valid - that is that any patterns compile correctly, and the "except" is a subset of the "grant". However, the previous implementation would not use the FieldPermissionsCache for this, so it would compile (union, intersect & minimize) automatons every time a role was parsed. This was particularly an issue when parsing roles (from the security index) in the GET /_security/role/ endpoint. If there were a large number of roles with field level security the automaton parsing could have significant impact on the performance of this API. Backport of: elastic#94931 * Reuse FieldPermissionsCache in Role parsing (#94931) When parsing role descriptors, we ensure that the FieldPermissions (`"field_security":{ "grant":[ ... ], "except":[ ... ] }`) are valid - that is that any patterns compile correctly, and the "except" is a subset of the "grant". However, the previous implementation would not use the FieldPermissionsCache for this, so it would compile (union, intersect & minimize) automatons every time a role was parsed. This was particularly an issue when parsing roles (from the security index) in the GET /_security/role/ endpoint. If there were a large number of roles with field level security the automaton parsing could have significant impact on the performance of this API. * Fix spotless and imports
When parsing role descriptors, we ensure that the FieldPermissions (`"field_security":{ "grant":[ ... ], "except":[ ... ] }`) are valid - that is that any patterns compile correctly, and the "except" is a subset of the "grant". However, the previous implementation would not use the FieldPermissionsCache for this, so it would compile (union, intersect & minimize) automatons every time a role was parsed. This was particularly an issue when parsing roles (from the security index) in the GET /_security/role/ endpoint. If there were a large number of roles with field level security the automaton parsing could have significant impact on the performance of this API. Backport of: #94931
When parsing role descriptors, we ensure that the FieldPermissions (
"field_security":{ "grant":[ ... ], "except":[ ... ] }) are valid - that is that any patterns compile correctly, and the "except" is a subset of the "grant".However, the previous implementation would not use the FieldPermissionsCache for this, so it would compile (union, intersect & minimize) automatons every time a role was parsed.
This was particularly an issue when parsing roles (from the security index) in the GET /_security/role/ endpoint. If there were a large number of roles with field level security the automaton parsing could have significant impact on the performance of this API.