@ujjawal-khare-27 ujjawal-khare-27 commented Oct 8, 2025

Why are these changes needed?

Adds a WaitingTtlSeconds parameter to RayJob. WaitingTtlSeconds is the TTL, in seconds, after which a RayJob that is still waiting to be scheduled is marked as failed.

Related issue number

Closes #4037

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(
Signed-off-by: ujjawal-khare <khareu460@gmail.com>
ujjawal-khare-27 and others added 5 commits October 8, 2025 21:52
Member

@Future-Outlier Future-Outlier left a comment

Sorry, I remember that we decided not to add this PR.
cc @EagleLo, do you remember the reason?

@kevin85421
Member

> mark RayJob as failed when it is waiting to be scheduled.

Schedulers (e.g., Volcano, YuniKorn, Kueue, scheduler-plugins, KAI Scheduler, etc.) should be responsible for ensuring gang scheduling.

@EagleLo
Contributor

EagleLo commented Oct 15, 2025

@Future-Outlier Last time we discussed that since this applies to InteractiveMode, it's up to the user to decide when to stop or clean up the job. So we decided not to enforce a hard cutoff time in this case.
