-
- Notifications
You must be signed in to change notification settings - Fork 33.1k
Description
Bug report
Bug description:
Apologies if this is a duplicate. I couldn’t find a similar report, though.
The issue and how to reproduce
We’re seeing a performance regression since 124af17. The best way to reproduce it is to spawn lots of processes from a ThreadPoolExecutor
. For example:
#!/usr/bin/env python3 from concurrent.futures import ThreadPoolExecutor, wait from subprocess import Popen def target(i): p = Popen(['touch', f'/tmp/reproc-{i}']) p.communicate() executor = ThreadPoolExecutor(max_workers=64) futures = set() for i in range(10_000): futures.add(executor.submit(target, i)) wait(futures)
Before, on 49cae39, it’s roughly this:
real 0m2.419s user 0m4.524s sys 0m0.976s
Since 124af17, it’s roughly this:
real 0m11.772s user 0m10.287s sys 0m14.409s
An attempt at an analysis and possible fix
strace
shows that the new code doesn’t use vfork()
anymore but clone()
. I believe that the reason for this is an incorrect check of num_groups
(or extra_group_size
, as it is now called on main
).
124af17 checks if extra_group_size
is less than zero to determine if we can use vfork()
. Is that correct? Maybe this should be equal to zero? I’m talking about these two locations (diff relative to main
/9e56eedd018e1a46):
diff --git a/Modules/_posixsubprocess.c b/Modules/_posixsubprocess.c index 2898eedc3e..fb6c235901 100644 --- a/Modules/_posixsubprocess.c +++ b/Modules/_posixsubprocess.c @@ -889,7 +889,7 @@ do_fork_exec(char *const exec_array[], /* These are checked by our caller; verify them in debug builds. */ assert(uid == (uid_t)-1); assert(gid == (gid_t)-1); - assert(extra_group_size < 0); + assert(extra_group_size == 0); assert(preexec_fn == Py_None); /* Drop the GIL so that other threads can continue execution while this @@ -1208,7 +1208,7 @@ subprocess_fork_exec_impl(PyObject *module, PyObject *process_args, /* Use vfork() only if it's safe. See the comment above child_exec(). */ sigset_t old_sigs; if (preexec_fn == Py_None && allow_vfork && - uid == (uid_t)-1 && gid == (gid_t)-1 && extra_group_size < 0) { + uid == (uid_t)-1 && gid == (gid_t)-1 && extra_group_size == 0) { /* Block all signals to ensure that no signal handlers are run in the * child process while it shares memory with us. Note that signals * used internally by C libraries won't be blocked by
extra_group_size
is the result of the call to PySequence_Size(extra_groups_packed)
. If I understand the docs correctly, then this function only returns negative values to indicate errors. This error condition is already checked, right after the call itself:
extra_group_size = PySequence_Size(extra_groups_packed); if (extra_group_size < 0) goto cleanup;
Later in the code, extra_group_size
can never be less than zero. It can, however, be equal to zero if extra_groups
is an empty list. I believe this is what was meant to be checked here.
I’ll happily open a PR for this if you agree that this is the way to go.
CPython versions tested on:
3.11, 3.12, 3.13, CPython main branch
Operating systems tested on:
Linux
Linked PRs
Metadata
Metadata
Assignees
Labels
Projects
Status