@jonathan-buttner (Contributor) commented Oct 27, 2025

WIP

This PR only shows the changes I made so that I could test the issue in #137134.

To make the reproduction faster, I temporarily changed the code to allow shorter values for the scale-to-zero and scale-up cooldown times:

PUT /_cluster/settings
{
  "persistent": {
    "xpack.ml.trained_models.adaptive_allocations.scale_to_zero_time": "10s",
    "xpack.ml.trained_models.adaptive_allocations.scale_up_cooldown_time": "10s",
    "logger.org.elasticsearch.xpack.ml.inference.assignment": "DEBUG"
  }
}
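For anyone scripting the reproduction, here is a minimal Python sketch that builds the request body above and a matching body to undo it afterwards. The helper names `apply_body` and `reset_body` are hypothetical and not part of this PR. The reset relies on a `null` value in the cluster settings API restoring a setting to its default. Sending the request (client, URL, authentication) is left out on purpose.

```python
import json

# Setting names and values copied from the request above.
REPRO_SETTINGS = {
    "xpack.ml.trained_models.adaptive_allocations.scale_to_zero_time": "10s",
    "xpack.ml.trained_models.adaptive_allocations.scale_up_cooldown_time": "10s",
    "logger.org.elasticsearch.xpack.ml.inference.assignment": "DEBUG",
}


def apply_body(settings):
    """Body for PUT /_cluster/settings that applies the given persistent settings."""
    return {"persistent": dict(settings)}


def reset_body(settings):
    """Body that resets the same settings; None serializes to JSON null."""
    return {"persistent": {name: None for name in settings}}


if __name__ == "__main__":
    print(json.dumps(apply_body(REPRO_SETTINGS), indent=2))
    print(json.dumps(reset_body(REPRO_SETTINGS), indent=2))
```

Sending the reset body once the test is done keeps the shortened timings and the DEBUG logger from lingering on a shared cluster.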

Then we can follow the steps in the issue to reproduce, which are:

  1. Create a deployment by creating an inference endpoint:
PUT _inference/rerank/mytest-old
{
  "service": "elasticsearch",
  "service_settings": {
    "num_threads": 1,
    "model_id": ".rerank-v1",
    "adaptive_allocations": {
      "enabled": true,
      "min_number_of_allocations": 0,
      "max_number_of_allocations": 2
    }
  }
}
  2. Wait about 10 seconds for mytest-old to scale to zero, then check the stats:
GET _ml/trained_models/_stats
  3. Create a new deployment via a second inference endpoint. mytest-old should still exist, but it now has an allocation, which is not intended:
PUT _inference/rerank/mytest-new3
{
  "service": "elasticsearch",
  "service_settings": {
    "num_threads": 1,
    "model_id": ".rerank-v1",
    "adaptive_allocations": {
      "enabled": true,
      "min_number_of_allocations": 0,
      "max_number_of_allocations": 2
    }
  }
}
GET _ml/trained_models/_stats
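
To compare the two `_stats` responses without reading them by eye, a small Python sketch along these lines may help. It assumes each entry under `trained_model_stats` carries a `deployment_stats` object with `deployment_id` and `number_of_allocations` fields, and that the deployment ID matches the inference endpoint ID; verify both against the actual response. The function names and the sample dictionaries are hypothetical illustrations, not output captured from a cluster.

```python
def allocations_by_deployment(stats):
    """Map deployment_id -> number_of_allocations from a parsed _stats response."""
    result = {}
    for entry in stats.get("trained_model_stats", []):
        deployment = entry.get("deployment_stats")
        if deployment is None:
            continue  # model entry without a deployment
        result[deployment["deployment_id"]] = deployment.get("number_of_allocations", 0)
    return result


def unexpected_scale_ups(before, after):
    """Deployments that had zero allocations before and a positive count after."""
    previous = allocations_by_deployment(before)
    current = allocations_by_deployment(after)
    return sorted(
        dep for dep, count in previous.items()
        if count == 0 and current.get(dep, 0) > 0
    )


# Made-up responses shaped like the assumption above, for illustration only.
SAMPLE_BEFORE = {"trained_model_stats": [
    {"deployment_stats": {"deployment_id": "mytest-old", "number_of_allocations": 0}},
]}
SAMPLE_AFTER = {"trained_model_stats": [
    {"deployment_stats": {"deployment_id": "mytest-old", "number_of_allocations": 1}},
    {"deployment_stats": {"deployment_id": "mytest-new3", "number_of_allocations": 1}},
]}

if __name__ == "__main__":
    print(unexpected_scale_ups(SAMPLE_BEFORE, SAMPLE_AFTER))
```

Passing the response from step 2 as `before` and the final response as `after` would list `mytest-old` if the bug reproduces, and an empty list otherwise.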

@elasticsearchmachine (Collaborator)

Hi @jonathan-buttner, I've created a changelog YAML for you.

Labels: >bug, :ml (Machine learning), Team:ML (Meta label for the ML team), v9.3.0