-
Couldn't load subscription status.
- Fork 41
Allow dynamic window multiplier via extension #833
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Also fixes an irregular behavior in dynamic rate calculation to account for hits.
| // Only record the incoming hits when the current rate is within the allowed | ||
| // range, otherwise, do not record the hits and return the calculated rate. | ||
| // The idea is to continuously increase the rate limit. MaxAllowed sets a | ||
| // ceiling on it with the window duration. | ||
| // MaxAllowed sets a ceiling on the rate with the window duration. | ||
| // | ||
| // NOTE(marclop) We may want to add a follow-up static ceiling to avoid | ||
| // unbounded growth. | ||
| maxAllowed := math.Max(staticRate, previous*drc.WindowMultiplier) | ||
| maxAllowed := math.Max(staticRate, previous*windowMultiplier) | ||
| // Normalise the current rate assuming no more events will occur during the | ||
| // rest of the window. This will ensure that we record hits based on the | ||
| // currently observed hits and NOT based on extrapolated data. | ||
| current = current * drc.elapsed.Seconds() / drc.WindowDuration.Seconds() | ||
| if current <= maxAllowed { | ||
| if err := r.recordHits(ctx, drc, hits); err != nil { | ||
| // Deduce how many hits to record to reach to the max allowed number | ||
| remainingHits := int((maxAllowed - current) * drc.WindowDuration.Seconds()) | ||
| if err := r.recordHits(ctx, drc, min(hits, remainingHits)); err != nil { | ||
| return -1, err | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[For reviewers] I have updated the logic to calculate the upper bound and record hits for the current window. Previously, the current rate was calculated on the elapsed window - this meant that if we have a spike then the current rate would be high but as the evaluation period gets towards completion the current rate will decrease. Since the older logic updated hits for the current period only when the current rate was lower than max allowed, this could lead to the previous window always being empty (or low) in certain cases.
The logic here gets the current rate without accounting for any hits from the current request. This current rate is normalized over the full evaluation window to determine if more hits need to be recorded for the window or not. If more hits need to be recorded, then we calculate how many hits should be recorded to reach the max allowed value and only record that many hits.
For the normalization of current to the full window duration, we can also update the logic in peekRates to return this value; however, I chose to keep the peekRates interface returning the rate observed over the elapsed period since it felt more apt for this method. That being said, I don't have a strong preference here and am happy to change this.
| if w, ok := r.windowConfigurator.(component.Component); ok { | ||
| if err := w.Shutdown(ctx); err != nil { | ||
| return fmt.Errorf("failed to shutdown window configurator: %w", err) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to shut it down? Extensions should be shut down independently by the collector
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is what I thought too but the processor was already shutting down the class resolver extension and I followed suit. I will take a look and fix both - probably in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Codewise LGTM. Lots of flexibility with this change, good work!
Can you update the README.md to reflect these changes please?
Done! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
The PR does 2 things:
windowConfigurator. The extension, if configured, will provide dynamic rate limiting with dynamic multipliers for scaling up or down the effective ingest rate. For example: if the consumer lag for a deployment is high then it will slow down ingest by providing a<1multiplier. Currently, the lowest possible rate that the rate limiter can go it is the static rate defined in the rate limiter configurations but this is up for discussion and a follow up PR can be created to address this.Related to: https://github.com/elastic/hosted-otel-collector/issues/1498