Overall I think there's something to address here, but I'd like to clear up a few statements before we dive into what the actual request and potential solutions might be.
In no way are custom executors unsafe; I'm not sure what you mean here?
This is exactly what task executor preferences are for. I don't think there's a need for new Task APIs here, especially since the existing ones already address the whole task tree, as you're asking for.
Hm, this is a bit confusing: the DispatchQueue docs claim the protocol is adopted already:
..., but it seems it isn't in macOS 15.3 yet AFAICS.
Without getting into release details, I can confirm that yes, DispatchQueue is supposed to conform to TaskExecutor and eventually will. We also have the same "lagging adoption" problem in corelibs dispatch, which provides the queue implementations on Linux; we'll need to make sure the adoption is done properly there as well. But yes, in both places DispatchQueue is expected to implement this protocol eventually.
The OP's question is a bit vague, but in my experience this use case comes up when one would like to offload some blocking work from the shared global pool.
In that case, changes to task group "width" alone are not going to make a difference at all.
You can still have a bunch of tasks end up in those "width-limited groups" and have them all block the global pool, even if each group is limited to "(task group) width: 2" etc.
So this isn't about the width of a group, but about using a different set of threads for the "other work".
This is a pattern we've successfully used in production deployments of Akka in my previous life: putting the "bad blocking legacy calls" onto some other thread pool.
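To illustrate why the width limit doesn't help on its own, here's a minimal sketch of a "width: 2" group (the `blockingCall` stub is hypothetical, standing in for some blocking legacy API). The limit caps how many tasks are in flight within the group, but each in-flight task still occupies a global-pool thread for as long as it blocks:

```swift
// Hypothetical stand-in for a blocking legacy call (e.g. a blocking C API).
func blockingCall(_ item: Int) {
    // blocks the calling thread...
}

// A group manually limited to 2 tasks in flight at a time.
// Each task still runs on the shared global pool, so while it blocks
// it ties up a global-pool thread -- the width limit changes nothing there.
func processAll(_ items: [Int]) async {
    await withTaskGroup(of: Void.self) { group in
        var inFlight = 0
        for item in items {
            if inFlight >= 2 {
                await group.next() // wait for one task before adding another
                inFlight -= 1
            }
            group.addTask { blockingCall(item) } // still a global-pool thread
            inFlight += 1
        }
        await group.waitForAll()
    }
}
```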
Back to the root of the question though:
What you're actually getting at is that an oversubscribed system is limited by the thread count of the global pool. In reality, if all work is CPU-bound and not "blocking on IO", throwing more threads at it will not make much of a difference: it's still the same number of CPUs, and we'd only increase thread switching between all the work, which is ultimately still going to be executed on those same CPUs. Note that the thread count AFAIR is already some heuristic around 2x the number of cores etc.
So when does throwing more threads at such a problem "help", in the sense of preventing starving the shared pool? Well, when there is some "bad blocking work" (calling into blocking C APIs for example, or blocking IO -- which on Linux we'll now be able to move away from thanks to io_uring in swift-system) that we could move over to some dedicated threads: say, we're ok with 4 threads being dedicated to blocking IO.
So we have some section of the code we know makes such calls, and we want to section it off... That's exactly why task executors exist!
They provide a way to "execute this task (and its child tasks) on this executor (which may have many threads at disposal)."
The existing APIs, withTaskExecutorPreference() and Task(executorPreference:), achieve just that: a section of the task hierarchy will execute on the given executor.
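A hedged sketch of the whole pattern, assuming the SE-0417 APIs. The executor type, queue label, and the commented-out legacy call are illustrative, not an official implementation; until DispatchQueue itself conforms to TaskExecutor, you can wrap a dedicated queue like this:

```swift
import Dispatch

// Sketch of a task executor backed by a dedicated concurrent dispatch queue.
// Work enqueued here runs on that queue's threads, not the global pool.
final class BlockingWorkExecutor: TaskExecutor {
    private let queue = DispatchQueue(
        label: "blocking-work", attributes: .concurrent)

    func enqueue(_ job: consuming ExecutorJob) {
        let unownedJob = UnownedJob(job)
        queue.async {
            unownedJob.runSynchronously(on: self.asUnownedTaskExecutor())
        }
    }
}

let blockingExecutor = BlockingWorkExecutor()

// Everything in this closure -- including child tasks that don't override
// the preference -- runs on the dedicated queue, keeping the pool free:
await withTaskExecutorPreference(blockingExecutor) {
    // await callBlockingLegacyAPI()  // hypothetical blocking work
}

// The same preference can be set on an unstructured task:
Task(executorPreference: blockingExecutor) {
    // await callBlockingLegacyAPI()
}
```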
DispatchQueue will implement TaskExecutor, and NIO or other thread pools could implement it as well. We are also actively working on a global executor customization proposal, and I suspect a follow-up there might be to expose some platform-specific (e.g. pthreads-based) executor implementation which you could tell "use 4 threads please" and then use as a task executor...
--
Summing up, I don't think this pre-pitch is really pitching any new API? We already have the APIs. You can write a bunch of task executors and make a package of them right now, even.
Perhaps we should offer more executor implementations, but that is already on the radar and @al45tair is working on those -- perhaps as a side effect we'll make it possible to instantiate them and use them as specific task executors? Maybe, we'll see.
