[ML] Fix double-counting of inference memory in the assignment rebalancer #133919
Conversation
Hi @valeriy42, I've created a changelog YAML for you.
Pinging @elastic/ml-core (Team:ML)
- // We subtract native inference memory as the planner expects available memory for
- // native inference including current assignments.
- getNodeFreeMemoryExcludingPerNodeOverheadAndNativeInference(load),
+ load.getFreeMemoryExcludingPerNodeOverhead(),
I don't understand the method name getFreeMemoryExcludingPerNodeOverhead. Shouldn't that just be getFreeMemory()?
It seems that those numbers should be the same most of the time. Maybe there is a corner case where the ML node is used only for inference in Java code, so that we don't account for the 30MB of native code overhead, and NodeLoad.assignedNativeCodeOverheadMemory is 0.
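As a sketch of that relationship (semantics inferred from the method names in this thread, not copied from the actual NodeLoad source), the two free-memory getters would differ only by the assigned native code overhead, and so coincide exactly when that overhead is 0:

```java
// Hypothetical illustration of the corner case discussed above; the real
// NodeLoad class in Elasticsearch is more involved.
class NodeLoadCornerCase {
    long maxMemory;
    long assignedJobMemory;                // assumed to include native code overhead
    long assignedNativeCodeOverheadMemory; // e.g. 30MB when a native process runs, 0 otherwise

    long getFreeMemory() {
        return Math.max(0, maxMemory - assignedJobMemory);
    }

    // Assumed: leaves the native code overhead out of the assigned total,
    // so a caller (e.g. the planner) can account for the overhead itself.
    long getFreeMemoryExcludingPerNodeOverhead() {
        return Math.max(0, maxMemory - (assignedJobMemory - assignedNativeCodeOverheadMemory));
    }

    // When assignedNativeCodeOverheadMemory == 0 (e.g. inference runs only
    // in Java code), both getters return the same value.
}
```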
LGTM
LGTM
[ML] Fix double-counting of inference memory in the assignment rebalancer (elastic#133919)
[ML] Fix double-counting of inference memory in the assignment rebalancer (#133919) (#134053)
[ML] Fix double-counting of inference memory in the assignment rebalancer (#133919) (#134052)
[ML] Fix double-counting of inference memory in the assignment rebalancer (#133919) (#134051)
[ML] Fix double-counting of inference memory in the assignment rebalancer (#133919) (#134054)
The static method TrainedModelAssignmentRebalancer.getNodeFreeMemoryExcludingPerNodeOverheadAndNativeInference was used to subtract load.getAssignedNativeInferenceMemory() from load.getFreeMemoryExcludingPerNodeOverhead(). However, in NodeLoad.getFreeMemoryExcludingPerNodeOverhead(), native inference memory was already subtracted as part of the getAssignedJobMemoryExcludingPerNodeOverhead() calculation. This led to double-counting of the native inference memory. Avoiding this double-counting allows us to remove the private method getNodeFreeMemoryExcludingPerNodeOverheadAndNativeInference() entirely.
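For illustration, a minimal sketch of the double-counting (hypothetical field names and numbers, modeled on the getters named above; not the actual NodeLoad source):

```java
// Hypothetical sketch of the double-counting described in this PR.
class NodeLoadSketch {
    long maxMemory = 1000;                     // total ML memory on the node
    long assignedAnomalyDetectionMemory = 200; // other assigned job memory
    long assignedNativeInferenceMemory = 300;  // native inference assignments

    // Native inference memory is already part of the assigned-job total here,
    // while the per-node native code overhead is deliberately left out.
    long getFreeMemoryExcludingPerNodeOverhead() {
        long assignedJobMemoryExcludingPerNodeOverhead =
            assignedAnomalyDetectionMemory + assignedNativeInferenceMemory;
        return maxMemory - assignedJobMemoryExcludingPerNodeOverhead;
    }

    // The removed helper subtracted native inference memory a second time:
    // 1000 - (200 + 300) - 300 = 200, instead of the correct 500.
    long removedHelperValue() {
        return getFreeMemoryExcludingPerNodeOverhead() - assignedNativeInferenceMemory;
    }
}
```

With these numbers the planner would have seen only 200 units of free memory instead of 500, making nodes appear fuller than they are; the fix passes load.getFreeMemoryExcludingPerNodeOverhead() directly.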