Skip to content

Conversation

@bruno-
Copy link
Owner

@bruno- bruno- commented Dec 5, 2020

Ruby ticket for this PR: https://bugs.ruby-lang.org/issues/17370

Related PR for Scheduler#name_resolve hook: https://github.com/bruno-/ruby/pull/2

@bruno- bruno- force-pushed the scheduler_address_resolve_hook branch from 579d7da to c3034c9 Compare December 5, 2020 02:10
{
VALUE scheduler = rb_scheduler_current();

if (scheduler != Qnil && rb_scheduler_supports_io_write(scheduler)) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (scheduler != Qnil && rb_scheduler_supports_io_write(scheduler)) {
if (scheduler != Qnil && rb_scheduler_supports_address_resolve(scheduler)) {
if (ret == 0)
allocated_by_malloc = 1;
else {
VALUE scheduler = rb_scheduler_current();
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is Addrinfo.getaddrinfo function call chain:

  • addrinfo_s_getaddrinfo
  • addrinfo_list_new
  • call_getaddrinfo
  • rsock_getaddrinfo
  • rb_getaddrinfo 👈 we are adding a hook here
  • nogvl_getaddrinfo
  • getaddrinfo syscall

A hook is added so "deep" because:

  • Other ruby methods are calling deep into the above function chain. For example Socket.getnameinfo is calling rb_getaddrinfo directly.
  • rb_getaddrinfo function is in ext/socket/rubysocket.h. So, can rb_getaddrinfo be used in ruby extensions? If yes, we're covered: extension function calls will also be delegated to a Scheduler hook.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Option 1: // add this interface to core ruby rb_address_resolve(...) # not in an extension // rb_getaddrinfo -> calls rb_address_resolve if it's blocking lookup. // Option 2: rb_getaddrinfo -> core ruby -> invoke the hook // Option 3: // expose rb_scheduler functions
if (scheduler != Qnil && rb_scheduler_supports_address_resolve(scheduler)) {
VALUE host = rb_str_new_cstr(node);
VALUE rb_timeout = rb_time_timespec_new(timeout);
VALUE ip_addresses_array = rb_scheduler_address_resolve(scheduler, host, rb_timeout);
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is rb_getaddrinfo_a function, the equivalent of rb_getaddrinfo. The only difference is it receives a timeout argument.
The rest of the logic here is exactly the same.

@bruno- bruno- force-pushed the scheduler_address_resolve_hook branch from e609a6d to 2dbab63 Compare December 7, 2020 17:38
return fiber
end

def address_resolve(hostname, timeout = nil)
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want a timeout argument here? Scheduler hook is making address resolving non-blocking, so does it make sense to enforce a timeout?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we shouldn't need to add one, but we might be forced to by the interface.

for(i=0; i<len; i++) {
ip_address_str = array_ptr[i];
hostp = host_str(ip_address_str, hbuf, sizeof(hbuf), &additional_flags);
ip_address_ret = numeric_getaddrinfo(hostp, service, hints, &ai);
Copy link
Owner Author

@bruno- bruno- Dec 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are leveraging an existing non-blocking function numeric_getaddrinfo. It converts numeric host (ipv4 or ipv6 address) with service(port) and hints (flags) into an addrinfo struct.

@bruno-
Copy link
Owner Author

bruno- commented Dec 9, 2020

@ioquatix here's how to reproduce the error in bin/irb:

require "socket" Addrinfo.getaddrinfo("google.com", 80)

This is the expected error:

dyld: lazy symbol binding failed: Symbol not found: _rb_scheduler_current Referenced from: /Users/bruno/src/ruby_build/.ext/x86_64-darwin19/socket.bundle Expected in: flat namespace dyld: Symbol not found: _rb_scheduler_current Referenced from: /Users/bruno/src/ruby_build/.ext/x86_64-darwin19/socket.bundle Expected in: flat namespace 

I also tried using rb_scheduler_get() and I got the same error. Thanks for your help with this!

@ioquatix
Copy link

ioquatix commented Dec 9, 2020

Sorry, I ran out of time today, but I will take a look tomorrow.

@ioquatix
Copy link

ioquatix commented Dec 9, 2020

@ko1 can we expose rb_scheduler functions to extensions?

@ko1
Copy link

ko1 commented Dec 12, 2020

@ko1 can we expose rb_scheduler functions to extensions?

As I explained before, rb_scheduler_ prefix is not acceptable.

@bruno- bruno- force-pushed the scheduler_address_resolve_hook branch from f72a7e1 to e494863 Compare April 4, 2021 20:06
@bruno- bruno- force-pushed the scheduler_address_resolve_hook branch from e494863 to 7ec4291 Compare April 4, 2021 20:09
@ioquatix
Copy link

ioquatix commented Apr 5, 2021

This is starting to shape up really nicely.

@bruno-
Copy link
Owner Author

bruno- commented Apr 5, 2021

@ioquatix I think I need help with compiling (again). I'm still getting the below error when I run tests for "address resolve" feature via make test-all TESTS='fiber/test_address_resolve.rb':

[ 2/10] TestAddressResolve#test_addrinfo_getaddrinfo_domain_blockingdyld: lazy symbol binding failed: Symbol not found: _rb_fiber_scheduler_current Referenced from: /Users/bruno/src/ruby_build/.ext/x86_64-darwin19/socket.bundle Expected in: flat namespace dyld: Symbol not found: _rb_fiber_scheduler_current Referenced from: /Users/bruno/src/ruby_build/.ext/x86_64-darwin19/socket.bundle Expected in: flat namespace 

I'm not sure why am I still getting this.

  • ruby/fiber/scheduler.h is #included in ext/socket/rubysocket.h
  • ext/socket/raddrinfo.c #includes ext/socket/rubysocket.h
  • I ran make update-deps and it made relevant changes to ext/socket/depend

Any tips? What else could I try? Thanks 🙏

@ioquatix
Copy link

ioquatix commented Apr 9, 2021

I have forked this branch into a PR here: ruby#4375

bruno- pushed a commit that referenced this pull request Apr 27, 2021
This commit adds a check on the ep just like in the mark function. The env can contain null bytes if allocation tracing is enabled. We're seeing errors during autocompaction like this: ``` (lldb) bt 40 * thread #1, name = 'ruby', stop reason = signal SIGABRT frame #0: 0x00007f7d64b6018b libc.so.6`raise + 203 frame #1: 0x00007f7d64b3f859 libc.so.6`abort + 299 frame #2: 0x000055af5f2fefc9 ruby`die at error.c:764:5 frame ruby#3: 0x000055af5f2ff1ac ruby`rb_bug_for_fatal_signal(default_sighandler=0x0000000000000000, sig=11, ctx=0x000055af60bc3340, fmt="") at error.c:804:5 frame ruby#4: 0x000055af5f4bd08f ruby`sigsegv(sig=11, info=0x000055af60bc3470, ctx=0x000055af60bc3340) at signal.c:960:5 frame ruby#5: 0x00007f7d64ebe3c0 libpthread.so.0`__restore_rt frame ruby#6: 0x000055af5f339b0a ruby`gc_ref_update_imemo(objspace=0x000055af60b2b040, obj=0x00007f7d5b513fd0) at gc.c:9046:13 frame ruby#7: 0x000055af5f339172 ruby`gc_update_object_references(objspace=0x000055af60b2b040, obj=0x00007f7d5b513fd0) at gc.c:9307:9 frame ruby#8: 0x000055af5f338e79 ruby`gc_ref_update(vstart=0x00007f7d5b510010, vend=0x00007f7d5b513ff8, stride=40, objspace=0x000055af60b2b040, page=0x000055af62577aa0) at gc.c:9452:21 frame ruby#9: 0x000055af5f337846 ruby`gc_update_references(objspace=0x000055af60b2b040, heap=0x000055af60b2b068) at gc.c:9481:9 frame ruby#10: 0x000055af5f336569 ruby`gc_compact_finish(objspace=0x000055af60b2b040, heap=0x000055af60b2b068) at gc.c:4840:5 frame ruby#11: 0x000055af5f335efb ruby`gc_page_sweep(objspace=0x000055af60b2b040, heap=0x000055af60b2b068, sweep_page=0x000055af63a1eb30) at gc.c:5046:13 frame ruby#12: 0x000055af5f3355c5 ruby`gc_sweep_step(objspace=0x000055af60b2b040, heap=0x000055af60b2b068) at gc.c:5214:19 frame ruby#13: 0x000055af5f33daf6 ruby`gc_sweep_rest(objspace=0x000055af60b2b040) at gc.c:5271:2 frame ruby#14: 0x000055af5f33cacd ruby`gc_sweep(objspace=0x000055af60b2b040) at gc.c:5389:2 frame ruby#15: 0x000055af5f33c21d ruby`gc_marks_rest(objspace=0x000055af60b2b040) at gc.c:7555:5 frame ruby#16: 0x000055af5f324d41 ruby`gc_rest(objspace=0x000055af60b2b040) at gc.c:8457:13 frame ruby#17: 0x000055af5f3297d8 ruby`garbage_collect(objspace=0x000055af60b2b040, reason=45568) at gc.c:8318:9 frame ruby#18: 0x000055af5f344ece ruby`garbage_collect_with_gvl(objspace=0x000055af60b2b040, reason=45568) at gc.c:8632:9 frame ruby#19: 0x000055af5f344e61 ruby`objspace_malloc_gc_stress(objspace=0x000055af60b2b040) at gc.c:10592:9 frame #20: 0x000055af5f32ced1 ruby`objspace_xmalloc0(objspace=0x000055af60b2b040, size=64) at gc.c:10767:5 frame ruby#21: 0x000055af5f32ce11 ruby`ruby_xmalloc0(size=64) at gc.c:10988:12 frame ruby#22: 0x000055af5f32cdac ruby`ruby_xmalloc_body(size=64) at gc.c:10997:12 frame ruby#23: 0x000055af5f329415 ruby`ruby_xmalloc(size=64) at gc.c:12942:12 frame ruby#24: 0x00007f7d611c4fe5 objspace.so`newobj_i(tpval=0x00007f7d5b553770, data=0x000055af639031a0) at object_tracing.c:101:35 frame ruby#25: 0x000055af5f5b283f ruby`tp_call_trace(tpval=0x00007f7d5b553770, trace_arg=0x00007fff1016d398) at vm_trace.c:1115:2 frame ruby#26: 0x000055af5f5b50ec ruby`exec_hooks_body(ec=0x000055af60b2b700, list=0x000055af60b2b920, trace_arg=0x00007fff1016d398) at vm_trace.c:304:3 frame ruby#27: 0x000055af5f5b0f24 ruby`exec_hooks_unprotected(ec=0x000055af60b2b700, list=0x000055af60b2b920, trace_arg=0x00007fff1016d398) at vm_trace.c:333:5 frame ruby#28: 0x000055af5f5b0da8 ruby`rb_exec_event_hooks(trace_arg=0x00007fff1016d398, hooks=0x000055af60b2b920, pop_p=0) at vm_trace.c:378:13 frame ruby#29: 0x000055af5f33f8e2 ruby`rb_exec_event_hook_orig(ec=0x000055af60b2b700, hooks=0x000055af60b2b920, flag=1048576, self=0x00007f7d5b5c08c0, id=0, called_id=0, klass=0x0000000000000000, data=0x00007f7d5b513fd0, pop_p=0) at vm_core.h:1989:5 frame ruby#30: 0x000055af5f334975 ruby`gc_event_hook_body(ec=0x000055af60b2b700, objspace=0x000055af60b2b040, event=1048576, data=0x00007f7d5b513fd0) at gc.c:2083:5 * frame ruby#31: 0x000055af5f3342df ruby`newobj_slowpath_wb_protected [inlined] newobj_slowpath(klass=0x00007f7d5b9d19c8, flags=0x000000000000001a, objspace=0x000055af60b2b040, cr=0x000055af60b2b910, wb_protected=1) at gc.c:2284:9 frame ruby#32: 0x000055af5f33410f ruby`newobj_slowpath_wb_protected(klass=0x00007f7d5b9d19c8, flags=0x000000000000001a, objspace=0x000055af60b2b040, cr=0x000055af60b2b910) at gc.c:2299 frame ruby#33: 0x000055af5f333de9 ruby`newobj_of0(klass=0x00007f7d5b9d19c8, flags=0x000000000000001a, wb_protected=1, cr=0x000055af60b2b910) at gc.c:2338:11 frame ruby#34: 0x000055af5f3227ae ruby`newobj_of(klass=0x00007f7d5b9d19c8, flags=0x000000000000001a, v1=0x000055af657d88a0, v2=0x000055af657d8890, v3=0x0000000000000000, wb_protected=1) at gc.c:2348:17 frame ruby#35: 0x000055af5f322c5b ruby`rb_imemo_new(type=imemo_env, v1=0x000055af657d88a0, v2=0x000055af657d8890, v3=0x0000000000000000, v0=0x00007f7d5b9d19c8) at gc.c:2434:12 frame ruby#36: 0x000055af5f5a3925 ruby`vm_env_new(env_ep=0x000055af657d88a0, env_body=0x000055af657d8890, env_size=4, iseq=0x00007f7d5b9d19c8) at vm_core.h:1363:33 frame ruby#37: 0x000055af5f5a3808 ruby`vm_make_env_each(ec=0x000055af60b2b700, cfp=0x00007f7d6482fc90) at vm.c:801:11 frame ruby#38: 0x000055af5f5a368d ruby`vm_make_env_each(ec=0x000055af60b2b700, cfp=0x00007f7d6482fc20) at vm.c:752:13 frame ruby#39: 0x000055af5f5a368d ruby`vm_make_env_each(ec=0x000055af60b2b700, cfp=0x00007f7d6482fbb0) at vm.c:752:13 (lldb) f 31 frame ruby#31: 0x000055af5f3342df ruby`newobj_slowpath_wb_protected [inlined] newobj_slowpath(klass=0x00007f7d5b9d19c8, flags=0x000000000000001a, objspace=0x000055af60b2b040, cr=0x000055af60b2b910, wb_protected=1) at gc.c:2284:9 2281 } 2282 GC_ASSERT(obj != 0); 2283 newobj_init(klass, flags, wb_protected, objspace, obj); -> 2284 gc_event_hook_prep(objspace, RUBY_INTERNAL_EVENT_NEWOBJ, obj, newobj_fill(obj, 0, 0, 0)); 2285 } 2286 RB_VM_LOCK_LEAVE_CR_LEV(cr, &lev); 2287 (lldb) p obj (VALUE) $3 = 0x00007f7d5b513fd0 (lldb) f 6 frame ruby#6: 0x000055af5f339b0a ruby`gc_ref_update_imemo(objspace=0x000055af60b2b040, obj=0x00007f7d5b513fd0) at gc.c:9046:13 9043 { 9044 rb_env_t *env = (rb_env_t *)obj; 9045 TYPED_UPDATE_IF_MOVED(objspace, rb_iseq_t *, env->iseq); -> 9046 UPDATE_IF_MOVED(objspace, env->ep[VM_ENV_DATA_INDEX_ENV]); 9047 gc_update_values(objspace, (long)env->env_size, (VALUE *)env->env); 9048 } 9049 break; (lldb) p obj (VALUE) $4 = 0x00007f7d5b513fd0 (lldb) ```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

5 participants