Bug #21007

closed

Ractor scheduler issue when multiple threads in a ractor

Added by luke-gru (Luke Gruber) 10 months ago. Updated 5 months ago.

Status:
Closed
Assignee:
Target version:
-
[ruby-core:120510]

Description

When there are multiple threads in a ractor, these threads can get in a state where they are yielding every 10ms instead of every 100ms.

This occurs because thread_sched_switch0, which is called by thread_sched_switch, calls ruby_thread_set_native. That function calls
rb_ractor_set_current_ec for the next thread to run. When the next thread then sets itself up before it runs, it calls rb_ractor_thread_switch,
but since the ec has already been changed, th->running_time_us is never reset to 0.

The yielding happens every 10ms because th->running_time_us, never having been reset, holds a very large value. Each time it is compared to the 100ms threshold the comparison succeeds, so the thread always yields.

This script takes a very long time due to this issue:

    ractors = 5.times.map do |i|
      Ractor.new(i) do |i0|
        ts = 4.times.map do
          Thread.new do
            counter = 0
            while counter < 30_000_000
              counter += 1
            end
          end
        end
        until ts.none? { |t| t.alive? }
          $stderr.puts "Ractor #{i0} main thread sleeping"
          sleep 1
        end
        ts.each(&:join)
        $stderr.puts "Ractor #{i0} done"
      end
    end

    while ractors.any?
      r, obj = Ractor.select(*ractors)
      ractors.delete(r)
    end

The fix is to set next_th->running_time_us back to 0 in thread_sched_switch0.
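The accounting problem and the fix described above can be illustrated with a toy model. This is only a sketch and not CRuby's scheduler: `SimThread`, `TICK_US`, `QUANTUM_US`, and `simulate` are hypothetical names, and the model assumes a timer check every 10ms against a 100ms time slice, as stated in the description.

```ruby
# Toy model of per-thread time-slice accounting. NOT CRuby's implementation;
# all names here are hypothetical and exist only for illustration.
TICK_US    = 10_000   # assumed interval between timer checks (10ms)
QUANTUM_US = 100_000  # assumed time slice before a thread should yield (100ms)

SimThread = Struct.new(:name, :running_time_us)

# Runs `ticks` timer checks over two threads that take turns on one ractor.
# Returns the 1-based tick numbers at which the running thread yielded.
# With reset_on_switch: false the incoming thread keeps its old counter,
# which mimics the missed reset described in this report.
def simulate(ticks:, reset_on_switch:)
  threads = [SimThread.new("a", 0), SimThread.new("b", 0)]
  current = 0
  yields = []

  ticks.times do |tick|
    th = threads[current]
    th.running_time_us += TICK_US
    next if th.running_time_us < QUANTUM_US

    # The running thread has used up its slice: record the yield and switch.
    yields << tick + 1
    current = (current + 1) % threads.size
    # The fix: zero the incoming thread's counter at switch time.
    threads[current].running_time_us = 0 if reset_on_switch
  end

  yields
end

fixed = simulate(ticks: 40, reset_on_switch: true)
buggy = simulate(ticks: 40, reset_on_switch: false)
puts "with reset:    #{fixed.inspect}"
puts "without reset: first yields #{buggy.first(5).inspect}, #{buggy.size} yields total"
```

In this model the run with the reset yields at ticks 10, 20, 30, and 40, that is, once per 100ms. Without the reset, each thread gets one full slice and then the yields come on every single tick (10, 20, 21, 22, ...), which matches the 10ms behaviour reported here.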

Updated by luke-gru (Luke Gruber) 10 months ago · Edited Actions #1 [ruby-core:120511]

PR here: https://github.com/ruby/ruby/pull/12521

Edit: This is getting fixed by a separate PR because someone else noticed this issue too.

That PR is here: https://github.com/ruby/ruby/pull/12094 and should land soon (hopefully).

Updated by jhawthorn (John Hawthorn) 6 months ago Actions #2

  • Assignee set to ractor

Updated by hsbt (Hiroshi SHIBATA) 6 months ago Actions #3

  • Status changed from Open to Assigned

Updated by jhawthorn (John Hawthorn) 5 months ago Actions #4 [ruby-core:122466]

  • Status changed from Assigned to Closed

This was fixed by resetting running_time_us.
