Bug #16809
Closed
Fiber crashes with --with-coroutine=copy
Description
./revision.h unchanged
#190 test_fiber.rb:15:in `<top (required)>':
     Fiber.new{ }.resume
     :ok
  #=> "" (expected "ok")
#192 test_fiber.rb:26:in `<top (required)>':
     fibers = 100.times.collect{Fiber.new{Fiber.yield}}
     fibers.each(&:resume)
     fibers.each(&:resume)
     :ok
  #=> "" (expected "ok")
#193 test_fiber.rb:33:in `<top (required)>':
     at_exit { Fiber.new{}.resume }
  #=> killed by SIGFPE (signal 8)
#194 test_fiber.rb:37:in `<top (required)>':
     Fiber.new(&Object.method(:class_eval)).resume("foo")
  #=> killed by SIGFPE (signal 8) [ruby-dev:34128]
test_fiber.rb FAIL 4/5
#934 test_massign.rb:165:in `<top (required)>':
     a,s=[],"aaa"
     300.times { a<<s; s=s.succ }
     eval <<-END__
     GC.stress=true
     Fiber.new do
       #{ a.join(",") },*zzz=1
     end.resume
     END__
     :ok
  #=> "" (expected "ok") [ruby-dev:32581]
test_massign.rb FAIL 1/34
#1391 test_thread.rb:310:in `<top (required)>':
     g = enum_for(:local_variables)
     loop { g.next }
  #=> killed by SIGFPE (signal 8) [ruby-dev:34128]
#1392 test_thread.rb:315:in `<top (required)>':
     g = enum_for(:block_given?)
     loop { g.next }
  #=> killed by SIGFPE (signal 8) [ruby-dev:34128]
#1393 test_thread.rb:320:in `<top (required)>':
     g = enum_for(:binding)
     loop { g.next }
  #=> killed by SIGFPE (signal 8) [ruby-dev:34128]
#1394 test_thread.rb:325:in `<top (required)>':
     g = "abc".enum_for(:scan, /./)
     loop { g.next }
  #=> killed by SIGFPE (signal 8) [ruby-dev:34128]
#1395 test_thread.rb:330:in `<top (required)>':
     g = Module.enum_for(:new)
     loop { g.next }
  #=> killed by SIGFPE (signal 8) [ruby-dev:34128]
test_thread.rb FAIL 5/48
Thread count: 10000 (skipping)
FAIL 10/1409 tests failed
make: *** [uncommon.mk:751: yes-btest-ruby] Error 1

May be related to this warning:

compiling coroutine/copy/Context.c
coroutine/copy/Context.c: In function 'coroutine_restore_stack_padded':
coroutine/copy/Context.c:87:34: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
   87 | _longjmp(context->state, 1 | (int)buffer);
      |
Updated by puchuu (Andrew Aladjev) about 5 years ago
I've tested the copy coroutine. Unfortunately, it is currently completely broken: hangs, segfaults, etc.
Updated by jeremyevans0 (Jeremy Evans) about 5 years ago
- Subject changed from ruby testsuite fails on s390x alpine (musl) with --with-coroutine=copy to Fiber crashes with --with-coroutine=copy
- Status changed from Open to Assigned
- Assignee set to ioquatix (Samuel Williams)
OpenBSD/sparc64 (which uses the copy coroutine) is similarly broken with regard to fibers. Even something as simple as ruby27 -e 'Fiber.new{Fiber.yield}.resume' crashes (ruby26 runs this fine). I am changing the title to be more general, since this does not affect only s390x alpine (musl).
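For anyone who wants to check a build quickly without running the whole btest suite, the snippets quoted in this report can be collected into one small script. This is only a sketch: the method name fiber_smoke_checks and the result keys are made up for illustration and are not part of Ruby's test suite. On a working build every check should pass; on an affected build the process may crash or hang before anything is printed.

```ruby
# Fiber smoke test assembled from the snippets quoted in this report.
# Run it with the Ruby build under test (for example, one configured
# with --with-coroutine=copy). The method name and result keys are
# hypothetical, chosen only for this sketch.
def fiber_smoke_checks
  results = {}

  # Simplest case: create a fiber and run it to completion.
  results[:empty_fiber] = Fiber.new { :ok }.resume

  # Yield once from inside a fiber, then resume it again
  # (a variant of the one-liner quoted above).
  f = Fiber.new { Fiber.yield :yielded; :finished }
  results[:yield_once] = [f.resume, f.resume]

  # Many fibers suspended at the same time, then all resumed
  # (modelled on test_fiber.rb:26).
  finished = 0
  fibers = 100.times.collect { Fiber.new { Fiber.yield; finished += 1 } }
  fibers.each(&:resume)
  fibers.each(&:resume)
  results[:many_fibers] = finished

  # External enumerators are driven by fibers internally
  # (modelled on test_thread.rb:325).
  g = "abc".enum_for(:scan, /./)
  chars = []
  loop { chars << g.next }
  results[:enumerator] = chars

  results
end

p fiber_smoke_checks if __FILE__ == $PROGRAM_NAME
```

The at_exit and GC.stress cases from the log above are left out on purpose to keep the script short; they can be added in the same way if the basic checks pass.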
Updated by jeremyevans0 (Jeremy Evans) about 5 years ago
It looks like sometimes the copy coroutine implementation can segfault even on x86_64: https://travis-ci.org/github/ruby/ruby/jobs/729643639
Updated by ioquatix (Samuel Williams) about 5 years ago
This might be a pointer alignment issue or a problem with the alloca elision.
After playing around with godbolt compiler explorer, I think this might be one option:
https://github.com/ruby/ruby/pull/3624
However, I wouldn't be surprised if it doesn't solve the issue.
Updated by jeremyevans0 (Jeremy Evans) about 5 years ago
I tried pull request #3624 on OpenBSD/sparc64 and it still crashed.
I was able to come up with a fix that works on OpenBSD/sparc64, as long as a couple files are compiled without optimization: https://github.com/ruby/ruby/pull/3726
Updated by ioquatix (Samuel Williams) about 5 years ago
I think we found the root cause of this, and it should be addressed by:
https://github.com/ruby/ruby/pull/3624/commits/9de559acc82a28bb0d912ed55cd36cf6f652ea9f
However, @jeremyevans0 (Jeremy Evans) is still testing it.
Updated by ioquatix (Samuel Williams) almost 5 years ago
- Status changed from Assigned to Closed
This was fixed in https://github.com/ruby/ruby/pull/3624
Updated by ioquatix (Samuel Williams) almost 5 years ago
- Backport changed from 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN to 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: REQUIRED
@jeremyevans0 (Jeremy Evans) can you manage the backport? Or who is responsible?
This commit (and only this commit) should be backported: https://github.com/ruby/ruby/pull/3624/commits/440983fa9e7695d83def190e9701b5a22e076495
Updated by jeremyevans0 (Jeremy Evans) almost 5 years ago
- Backport changed from 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: REQUIRED to 2.5: DONTNEED, 2.6: DONTNEED, 2.7: REQUIRED
ioquatix (Samuel Williams) wrote in #note-8:
@jeremyevans0 (Jeremy Evans) can you manage the backport? Or who is responsible?
The branch maintainer is responsible. For 2.7, that is currently @nagachika (Tomoyuki Chikanaga).
I updated the backport flag to indicate this is only needed by 2.7 and not earlier versions.
Updated by nagachika (Tomoyuki Chikanaga) over 4 years ago
- Backport changed from 2.5: DONTNEED, 2.6: DONTNEED, 2.7: REQUIRED to 2.5: DONTNEED, 2.6: DONTNEED, 2.7: DONE
ruby_2_7 d84cc717020be1da7d89b6bda02d1427f9593968 merged revision(s) 15e23312f6abcbf1afc6fbbf7917a57a0637f680.