Closed
Labels
enhancement (New feature or request)
Milestone
Description
Currently, the user must tune an API's CPU request for horizontal pod autoscaling to behave as expected. An approach based on concurrent requests per container may be better (similar to what Knative uses).
This would also make autoscaling for GPU workloads behave more as expected.
It may make sense to have both request-based and CPU/GPU-based autoscaling active at the same time, i.e. the API scales up when either threshold is exceeded, and does not scale back down until both metrics have backed off.
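A minimal sketch of how the combined rule could work, assuming each metric yields its own desired replica count and the larger one wins. Taking the maximum gives the behavior described above: either metric can force a scale-up, and a scale-down only happens once both have dropped. All names here (`desired_replicas`, `target_concurrency`, `target_utilization_pct`, and the replica bounds) are hypothetical and not part of any existing API. Utilization is passed as an integer percentage so the ceiling division stays exact.

```python
def ceil_div(numerator: int, denominator: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-numerator // denominator)


def desired_replicas(
    current_replicas: int,
    in_flight_requests: int,      # total concurrent requests across all replicas
    target_concurrency: int,      # hypothetical target: concurrent requests per replica
    utilization_pct: int,         # average CPU or GPU utilization across replicas, 0-100
    target_utilization_pct: int,  # hypothetical target utilization, 0-100
    min_replicas: int = 1,
    max_replicas: int = 10,
) -> int:
    # Replicas needed so each one handles at most target_concurrency requests.
    by_concurrency = ceil_div(in_flight_requests, target_concurrency)

    # Replicas needed to bring average utilization back to the target,
    # using the proportional rule: current * observed / target.
    by_utilization = ceil_div(
        current_replicas * utilization_pct, target_utilization_pct
    )

    # The larger recommendation wins: scale up if either metric is over its
    # threshold, scale down only when both recommend fewer replicas.
    desired = max(by_concurrency, by_utilization)

    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))
```

For example, with 4 replicas, a concurrency target of 10, and a utilization target of 60%, `desired_replicas(4, 20, 10, 60, 60)` stays at 4: the request load alone would allow 2 replicas, but utilization has not backed off yet. This mirrors how the Kubernetes Horizontal Pod Autoscaler treats multiple metrics, where the highest per-metric recommendation is used.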