Bug #17618

closed

Exceptions in Fiber Scheduler causes a segv

Added by tenderlovemaking (Aaron Patterson) almost 5 years ago. Updated over 4 years ago.

Status:
Closed
Assignee:
-
Target version:
-
ruby -v:
ruby 3.1.0dev (2021-02-09T13:22:37Z master e7a831de8e) [x86_64-darwin20]
[ruby-core:102429]

Description

If the fiber scheduler doesn't define an unblock function, Ruby will segv when threads are joined.

Here is an example program:

class Scheduler
  def block blocker, timeout = nil
  end

  def fiber &block
    fiber = Fiber.new blocking: false, &block
    fiber.resume
    fiber
  end
end

Fiber.set_scheduler Scheduler.new

Fiber.schedule do
  Thread.new { }.join
end

The backtrace looks like this:

(lldb) bt
* thread #3, name = 'test.rb:17', stop reason = EXC_BAD_ACCESS (code=1, address=0xb0)
    frame #0: 0x00000001000dc49a miniruby`rb_ec_tag_jump(ec=0x0000000100a2ec50, st=RUBY_TAG_RAISE) at eval_intern.h:185:20
    frame #1: 0x00000001000dbda7 miniruby`rb_longjmp(ec=0x0000000100a2ec50, tag=6, mesg=0x000000010101b3f8, cause=0x0000000000000008) at eval.c:699:5
    frame #2: 0x00000001000dbb9c miniruby`rb_exc_raise(mesg=0x000000010101b3f8) at eval.c:717:5
    frame #3: 0x000000010037446c miniruby`raise_method_missing(ec=0x0000000100a2ec50, argc=3, argv=0x000070000e6d39e0, obj=0x000000010101b8d0, last_call_status=MISSING_MISSING) at vm_eval.c:955:2
    frame #4: 0x0000000100374288 miniruby`method_missing(ec=0x0000000100a2ec50, obj=0x000000010101b8d0, id=24721, argc=3, argv=0x000070000e6d39e0, call_status=MISSING_NOENTRY, kw_splat=0) at vm_eval.c:1002:5
    frame #5: 0x0000000100385fdd miniruby`rb_call0(ec=0x0000000100a2ec50, recv=0x000000010101b8d0, mid=24721, argc=2, argv=0x000070000e6d3be0, call_scope=CALL_FCALL, self=0x0000000000000008) at vm_eval.c:515:20
    frame #6: 0x0000000100358a02 miniruby`rb_funcallv_scope(recv=0x000000010101b8d0, mid=24721, argc=2, argv=0x000070000e6d3be0, scope=CALL_FCALL) at vm_eval.c:1021:16
    frame #7: 0x0000000100354c71 miniruby`rb_funcallv(recv=0x000000010101b8d0, mid=24721, argc=2, argv=0x000070000e6d3be0) at vm_eval.c:1038:12
    frame #8: 0x000000010035921d miniruby`rb_funcall(recv=0x000000010101b8d0, mid=24721, n=2) at vm_eval.c:1109:12
  * frame #9: 0x0000000100291d23 miniruby`rb_fiber_scheduler_unblock(scheduler=0x000000010101b8d0, blocker=0x000000010107bd70, fiber=0x000000010101b768) at scheduler.c:142:12
    frame #10: 0x00000001002f1445 miniruby`rb_threadptr_join_list_wakeup(thread=0x0000000100a2e9b0) at thread.c:555:13
    frame #11: 0x00000001002f0fd5 miniruby`thread_start_func_2(th=0x0000000100a2e9b0, stack_start=0x000070000e7d3f70) at thread.c:891:9
    frame #12: 0x00000001002f07b5 miniruby`thread_start_func_1(th_ptr=0x0000000100a2e9b0) at thread_pthread.c:1033:9
    frame #13: 0x00007fff2043a950 libsystem_pthread.dylib`_pthread_start + 224
    frame #14: 0x00007fff2043647b libsystem_pthread.dylib`thread_start + 15

It seems like the ec is missing a tag:

(lldb) f 0
frame #0: 0x00000001000dc49a miniruby`rb_ec_tag_jump(ec=0x0000000100a2ec50, st=RUBY_TAG_RAISE) at eval_intern.h:185:20
   182  static inline void
   183  rb_ec_tag_jump(const rb_execution_context_t *ec, enum ruby_tag_type st)
   184  {
-> 185      ec->tag->state = st;
   186      ruby_longjmp(ec->tag->buf, 1);
   187  }
   188
(lldb) p ec->tag
(rb_vm_tag *const) $1 = 0x0000000000000000
(lldb)

I tried popping the tag later in thread_start_func_2, but that caused the process to go into an infinite loop.

Updated by alanwu (Alan Wu) almost 5 years ago Actions #1 [ruby-core:102431]

Just some observations in case it's useful. Implementing unblock in the scheduler and printing out the current thread shows that unblock runs on a dead thread:

class Scheduler
  def block blocker, timeout = nil
  end

  def unblock a, b
    p Thread.current
  end

  def fiber &block
    fiber = Fiber.new blocking: false, &block
    fiber.resume
    fiber
  end
end

Fiber.set_scheduler Scheduler.new

Fiber.schedule do
  Thread.new { }.join
end

ruby 3.1.0dev (2021-02-09T22:47:36Z master 49d3830f44) [x86_64-darwin19]
#<Thread:0x00007fee4d81b490 test.rb:20 dead>

It doesn't seem right to run Ruby code on a dead thread.

Also, raising any exception in the unblock method will cause a SEGV. For example:

class Scheduler
  def block blocker, timeout = nil
  end

  def unblock a, b
    raise
  end

  def fiber &block
    fiber = Fiber.new blocking: false, &block
    fiber.resume
    fiber
  end
end

Fiber.set_scheduler Scheduler.new

Fiber.schedule do
  Thread.new { }.join
end

Updated by ioquatix (Samuel Williams) over 4 years ago Actions #2

My initial reaction is that a scheduler without unblock is broken by design. It's the dead thread which is invoking unblock as part of its tidy-up, which in other cases will wake up other threads. I don't have any strong opinion about it, except that a thread that transitions to dead is then able to notify others that join can proceed.
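As an illustration of the point that a scheduler lacking unblock is incomplete, here is a minimal sketch of a check that could run before a scheduler is installed. The names REQUIRED_HOOKS, missing_scheduler_hooks, checked_scheduler, IncompleteScheduler and CompleteScheduler are hypothetical and not part of Ruby. The hook list only covers the two methods named in this thread, so it is an assumption rather than the full scheduler interface.

```ruby
# Hypothetical helper, not part of Ruby. The list only contains the hooks
# mentioned in this thread; a real scheduler may need more.
REQUIRED_HOOKS = %i[block unblock].freeze

# Returns the hook names the given object does not respond to.
def missing_scheduler_hooks(scheduler, hooks = REQUIRED_HOOKS)
  hooks.reject { |hook| scheduler.respond_to?(hook) }
end

# Returns the scheduler unchanged, or raises if any hook is missing.
def checked_scheduler(scheduler, hooks = REQUIRED_HOOKS)
  missing = missing_scheduler_hooks(scheduler, hooks)
  return scheduler if missing.empty?

  raise ArgumentError, "scheduler is missing: #{missing.join(', ')}"
end

# Same shape as the scheduler in the report: no unblock method.
class IncompleteScheduler
  def block(blocker, timeout = nil)
  end

  def fiber(&block)
    fiber = Fiber.new(blocking: false, &block)
    fiber.resume
    fiber
  end
end

# Adds an empty unblock so the check passes. It does not wake anything up.
class CompleteScheduler < IncompleteScheduler
  def unblock(blocker, fiber)
  end
end

p missing_scheduler_hooks(IncompleteScheduler.new)
p missing_scheduler_hooks(CompleteScheduler.new)
```

With a helper like this, the call in the report could be written as Fiber.set_scheduler checked_scheduler(Scheduler.new), so a missing unblock is reported on the installing thread instead of surfacing later when a joined thread exits.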

Updated by ioquatix (Samuel Williams) over 4 years ago Actions #3 [ruby-core:103813]

I found the reason for this and I have made a PR which I think addresses this. I'll use this as a test case.

https://github.com/ruby/ruby/pull/4471

Updated by ioquatix (Samuel Williams) over 4 years ago Actions #4 [ruby-core:103845]

Okay, now rather than a SEGV, I get an unlimited number of

undefined method `unblock' for #<Scheduler:0x000000010a1b1fb0> (NoMethodError) 

which I think is at least somewhat better. So I'll merge the PR.

Updated by jeremyevans0 (Jeremy Evans) over 4 years ago Actions #5

  • Status changed from Open to Closed