- Notifications
You must be signed in to change notification settings - Fork 157
Add RTLD_GLOBAL support to FFI for Python C extension compatibility #3144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
It being explicitly doesn't automatically make it non-breaking. If that API were exposed it would be a breaking change as any call of that function would now be missing an argument. Also at all use-sites it is currently hard coded to true which doesn't preserve behaviour i.e. RTLD_GLOBAL is used instead of RTLD_LOCAL
Adding that option is left to future work so, no. As is I disagree regarding the backwards compatibility claim given the current state of this PR. Changing from RTLD_LOCAL to RTLD_GLOBAL and introducing the possibility of symbol conflict between different loaded libraries is most definitely a backwards incompatibility. |
| Yes, good catch! the default is supposed to be the existing behavior. |
b3bee9f to fe992f3 Compare | @Skgland these were good points. I am changing it to being a completely independent pathway so that the user can make an informed choice and it does not interfere with the existing FFI API. |
fe992f3 to b54ac34 Compare | I don't see the benefit of having completely separate path ways. use_foreign_module(Library, Predicates) :- use_foreign_module(Library, Predicates, []). use_foreign_module(Library, Predicates, Options) :- (Options = [mode(Mode)] -> true ; Mode = local), % options parsing oversimplified '$load_foreign_lib'(LibName, Predicates, Mode), maplist(assert_predicate, Predicates). |
This adds opt-in RTLD_GLOBAL support for FFI library loading while maintaining backwards compatibility with the default RTLD_LOCAL behavior. Changes: - Add new $load_foreign_lib_global/2 builtin alongside existing $load_foreign_lib/2 - Add load_library_global() function in src/ffi.rs (separate from load_library()) - Implement RTLD_GLOBAL flag on Unix systems in load_library_global() - Add use_foreign_module_global/2 predicate in ffi.pl for opt-in usage - Default behavior remains RTLD_LOCAL (safe, no symbol pollution) This is required for Python C extensions (NumPy, SciPy, pandas, etc.) which need to resolve symbols from libpython when embedding Python via FFI. Tested with NumPy 2.3.4 - all C extension imports and operations work correctly. J.J.'s robot
b54ac34 to 01fe37b Compare
Ah ok, I misunderstood. When you said "reserverd for future options", I thought you meant, "don't use the third arrity for anything". You literally meant "use it for options" as in "a list of options". I agree with this. I will come back to this after work. |
| By the way @Skgland if you have an alternative idea besides using RTLD for enabling the C Python libraries I would be happy to hear it. As far as I can tell this is the technique used by JNA and other similar libraries to allow this. |
|
These are very good questions. Python expects symbols shared across extension modules to use RTLD_GLOBAL. The net result is that if we open the python shared library in Windows is a little different because C extensions are required to be compiled against the DLL at the time the library is built and global resolution is accomplished in the IAT. You still need to provide the handle of the DLL, but the module has all of the resolution baked internally. This is not the case in Unix-based systems, which require the handle AND the symbols to be available in the global address space. So I am not aware of any other method to make the Python symbols available to the C extension module code without |
Implements flexible opt-in RTLD_GLOBAL support through use_foreign_module/3 with an options list parameter, replacing the separate _global variant. Changes: - Modified $load_foreign_lib from arity 2 to arity 3 to accept options list - Added options parsing to extract flags([rtld_global]) from option list - Updated load_library() in ffi.rs to accept use_global boolean parameter - Implemented conditional RTLD_GLOBAL flag on Unix when use_global is true - Updated ffi.pl to expose use_foreign_module/3 with options - Maintained backwards compatibility with use_foreign_module/2 (defaults to empty options) - Removed separate load_foreign_lib_global builtin and use_foreign_module_global predicate API: use_foreign_module(LibPath, Functions, [flags([rtld_global])]) % RTLD_GLOBAL use_foreign_module(LibPath, Functions, []) % RTLD_LOCAL (default) use_foreign_module(LibPath, Functions) % RTLD_LOCAL (default) This follows Scryer's existing pattern for option lists (similar to open/4) and provides a clean, extensible API for future FFI loading options. Required for Python C extensions (NumPy, SciPy, pandas) which need to resolve symbols from libpython when embedding Python via FFI. Tested with NumPy 2.3.4 - all C extension imports and operations work correctly. J.J.'s robot
Instead of adding boolean parameters for each new FFI option, use a LibraryLoadOptions struct that can be easily extended with new options in the future. Also improved options parsing code to use early continues and cleaner patterns instead of deeply nested if statements. J.J.'s robot
Given that the rust side there is not exposed/public API surface this isn't really a place where we need to worry about extensibility as we can always change it without breaking API compatibility. With only one Boolean option I would have probably just used a bool, but a struct seems reasonable and nice. Though if we already go the extra mile and are introducing a proper options type it might be nice to have a Scope enum rather than a bool. #[derive(Debug, Clone, Copy, Default)] enum RtdlScope { #[default] Local, Global, } |
So one thought about this is how many options we want to make available from the posix.1-2001 spec https://pubs.opengroup.org/onlinepubs/009695399/functions/dlopen.html we can always add more args later since it is internal but regarding this, there are some options to consider (forgive me if this is all already familiar territory to you). The RTLD flags can be OR'd together with The standard ones are global/local (scope), lazy/now (binding), and then the gnu only extensions are noload, nodelete, and deepbind. I've never really worked w/the last 3 so I'd feel ok omitting them for the sake of portability. Should we have two enums, one for scope and one for binding? |
Replace LibraryLoadOptions struct with separate RtldScope and RtldBinding enums, parsed from options list format. Changes: - New enums: RtldScope (Local/Global), RtldBinding (Lazy/Now) - Options format: [scope(global), binding(lazy)] - Unspecified options use POSIX defaults (local, lazy) - Forward compatible: unknown options ignored - Updated documentation with examples API: use_foreign_module(Lib, Fns, [scope(global)]) use_foreign_module(Lib, Fns, [scope(local), binding(now)]) use_foreign_module(Lib, Fns, []) % defaults Tested with NumPy - works correctly. J.J.'s robot
| Based on https://man7.org/linux/man-pages/man3/dlopen.3.html
Based on this not specifying neither is the same as RTLD_LOCAL and both doesn't make much sense to me either, so again exactly one seems reasonable. So this can be separate enum/option from the rest of the flags.
I think this means exactly one of the values rather than at least one (also both appears somewhat contradictory). Based on that if I were to add it I would again use a separate enum. Further I don't see much use in RTLD_NODELETE, RTLD_NOLOAD given the current exposed API surface. RTLD_DEEPBIND sounds useful, but I think this can be left out for now and implemented/added once someone has a use-case for that. SummaryIn summary I would just add one scope enum with RTLD_GLOBAL and RTLD_LOCAL. The other values I would leave to be added when a use-case comes up. If you want to add more or all of them I would make a separate eagerness enum for RTLD_LAZY and RTLD_NOW. |
| Ok. I will revert to single scope enum argument, I don't really have a need for the eagerness/binding right now. 6dde2d8 this has both as a reference for later perhaps. |
Remove RtldBinding enum and binding option from FFI interface. Libraries now always use RTLD_LAZY (lazy binding - symbols resolved as needed), which is the standard and typical choice. Changes: - Removed RtldBinding enum entirely - Keep only RtldScope enum (Local/Global) - Always use RTLD_LAZY in library loading - Simplified options parsing to handle only scope - Updated documentation to reflect scope-only API API: use_foreign_module(Lib, Fns, [scope(global)]) use_foreign_module(Lib, Fns, [scope(local)]) use_foreign_module(Lib, Fns, []) % default: local + lazy Tested with NumPy - works correctly. J.J.'s robot
| The Regarding the rest I have two points of contention.
See for example how this is done for scryer-prolog/src/lib/process.pl Lines 46 to 69 in e4d9692
scryer-prolog/src/machine/system_calls.rs Lines 8572 to 8590 in e4d9692
|
Move FFI options parsing from Rust to Prolog side following the pattern used by library(process). This improves code organization and provides better error messages. Changes: - Parse and validate options in ffi.pl before passing to Rust - Pass simple atom (local/global) to Rust instead of options list - Throw domain_error for unknown options (no silent ignoring) - Throw domain_error for duplicate scope options - Throw domain_error for invalid scope values - Add call_with_error_context for proper error reporting Tests: - Add test for unknown option error - Add test for duplicate option error - Add test for invalid scope value error - Verify existing FFI functionality unchanged - All tests pass (36 passed) Addresses feedback from PR mthom#3144 review. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Move FFI options parsing from Rust to Prolog side following the pattern used by library(process). This improves code organization and provides better error messages. Changes: - Parse and validate options in ffi.pl before passing to Rust - Pass simple atom (local/global) to Rust instead of options list - Throw domain_error for unknown options (no silent ignoring) - Throw domain_error for duplicate scope options - Throw domain_error for invalid scope value - Add call_with_error_context for proper error reporting Tests: - Add test for unknown option error - Add test for duplicate option error - Add test for invalid scope value error - Verify existing FFI functionality unchanged - All tests pass (36 passed) Addresses feedback from PR mthom#3144 review.
8f75a6e to 1e1853e Compare
(the robot is being a little dramatic, this really applies to Conda. But a lot of folks still use Conda for scientific computing, so it would be nice for us to let them do that via Scryer!)
re: Scryer Python
Problem
When embedding Python via FFI on Unix systems, Python C extension modules (NumPy, SciPy, pandas, and even standard library modules like
math,socket,_random) fail to load with "undefined symbol" errors.This occurs because Scryer Prolog's FFI loads shared libraries with
RTLD_LOCALby default (vialibloading::Library::new()), which isolates symbols. Python C extensions need to resolve symbols from libpython, but withRTLD_LOCAL, these symbols are not visible to subsequently loaded libraries.Example Error
```
Why This Matters
Python C extensions are ubiquitous in the Python ecosystem:
Without RTLD_GLOBAL support, none of these work when Python is embedded via Scryer Prolog's FFI.
Solution
Add a `use_global: bool` parameter to `ForeignFunctionTable::load_library()` that, when `true` on Unix systems, loads libraries with `RTLD_GLOBAL | RTLD_LAZY` flags using `libloading::os::unix::Library::open()`.
Changes
src/ffi.rs:
src/machine/system_calls.rs:
src/lib/ffi.pl:
Testing
Successfully tested with conda Python 3.11 + NumPy:
```prolog
?- use_module(library(ffi)).
?- use_foreign_module('/opt/conda/envs/myenv/lib/libpython3.11.so', [...]).
?- py_initialize.
?- py_run_simple_string("import numpy as np").
?- py_run_simple_string("print(f'NumPy version: {np.version}')").
NumPy version: 1.26.4
?- py_run_simple_string("arr = np.array([1,2,3,4,5])").
?- py_run_simple_string("print(f'Sum: {arr.sum()}')").
Sum: 15
```
Also verified with standard library modules (`import math`, `import socket`) that previously failed.
Backwards Compatibility
This change is backwards compatible:
Prior Art
Other language embedders use RTLD_GLOBAL for Python:
Future Work
Potential enhancements:
Checklist
This PR enables a significant expansion of Scryer Prolog's FFI capabilities, particularly for Python embedding use cases. Happy to adjust the approach based on maintainer feedback!
Signed-off-by: J.J.'s robot