Skip to content

Conversation

adamziel
Copy link
Collaborator

@adamziel adamziel commented Apr 23, 2024

Adds a PHP Process manager, a class that can spawn maintain a number of PHP instances at the same time.

The idea is to have:

  • A single "primary" PHP instance that's never killed – it contains the reference filesystem used by all other PHP instances.
  • A pool of disposable PHP instances that are spawned to handle a single request and reaped immediately after.

Spawning new PHP instances is reasonably fast and can happen on demand, there's no need to keep a pool of "warm" instances that occupies the memory all the time.

Example usage:

const mgr = new PHPProcessManager({ phpFactory: async () => NodePHP.load(RecommendedPHPVersion), maxPhpInstances: 4, }); const instance = await mgr.getInstance(); await instance.php.run({ code }); instance.reap();

Related to #1287

Remaining work

  • Add a "context" parameter to getInstance() to enable setting the correct SAPI Name, e.g. CLI or FPM depending on the purpose the PHP instance is created for. This would solve issues like PHP content type headers ignored? #1223. Alternatively, the consuming program could call setSapiName() on the PHP instance returned by getInstance() – but that assumes the factory did not initialize the web runtime.

Testing instructions

Confirm the unit tests pass. This PR only adds new code, it does not plug in the PHPProcessManager class into any request dispatching code yet.

Adds an optional `timeout` option to the Semaphore class that throws an error when the Semaphore cannot be acquired for `timeout` milliseconds. Related to #1287 ## Testing instructions Confirm the unit tests pass
Adds a PHP Process manager, a class that can spawn maintain a number of PHP instances at the same time. The idea is to have: * A single "primary" PHP instance that's never killed – it contains the reference filesystem used by all other PHP instances. * A pool of disposable PHP instances that are spawned to handle a single request and reaped immediately after. Spawning new PHP instances is reasonably fast and can happen on demand, there's no need to keep a pool of "warm" instances that occupies the memory all the time. Example usage: ```ts const mgr = new PHPProcessManager({	phpFactory: async () => NodePHP.load(RecommendedPHPVersion),	maxPhpInstances: 4, }); const instance = await mgr.getInstance(); await instance.php.run({ code }); instance.reap(); ``` Related to #1287 ## Testing instructions Confirm the unit tests pass. This PR only adds new code, it does not plug in the PHPProcessManager class into any request dispatching code yet.
adamziel added a commit that referenced this pull request Apr 23, 2024
Memoizes fetch() responses related to `php.wasm` files to avoid slow network calls every time a new PHP instance is spawned by [PHPProcessManager](#1301). This is related to [Loopback request support](#1287) where we have to spawn additional PHP instances on demand. ## Testing instructions Confirm CI checks passed.
adamziel added a commit that referenced this pull request Apr 23, 2024
Memoizes fetch() responses related to `php.wasm` files to avoid slow network calls every time a new PHP instance is spawned by [PHPProcessManager](#1301). This is related to [Loopback request support](#1287) where we have to spawn additional PHP instances on demand. ## Testing instructions Confirm CI checks passed.
Base automatically changed from semaphore-timeout to trunk April 23, 2024 10:40
@adamziel
Copy link
Collaborator Author

cc @dmsnell @brandonpayton @bgrgicak for reviews – this is an important component and I'd love to discuss it before moving further.

adamziel added a commit that referenced this pull request Apr 23, 2024
Memoizes fetch() responses related to `php.wasm` files to avoid slow network calls every time a new PHP instance is spawned by [PHPProcessManager](#1301). This is related to [Loopback request support](#1287) where we have to spawn additional PHP instances on demand. ## Testing instructions Confirm CI checks passed.
@bgrgicak
Copy link
Collaborator

this is an important component and I'd love to discuss it before moving further.

The approach makes sense to me. Do you have any specific concerns?

@adamziel
Copy link
Collaborator Author

adamziel commented Apr 24, 2024

The approach makes sense to me. Do you have any specific concerns?

Nothing specific, I just wanted to make sure I get some really smart folks to take a look :-)

Copy link
Member

@brandonpayton brandonpayton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adamziel, here are my thoughts and questions so far. I may keep reading and thinking about this for a bit but want to get some feedback out there.

In addition to the inline feedback, some other questions are:

  • Does there need to be a primary PHP instance?
  • What happens if a consumer disposes of the primary PHP instance? Does it matter?
  • Would it make sense to track parent/child relationships? Maybe that is something best left to the consumer of the PHP Process Manager.
@adamziel adamziel self-assigned this Apr 24, 2024
@adamziel
Copy link
Collaborator Author

adamziel commented Apr 25, 2024

All good notes @brandonpayton, thank you!

Does there need to be a primary PHP instance?

#1301 (comment)

What happens if a consumer disposes of the primary PHP instance? Does it matter?

They don't get access to a reap() function that would actually call php.exit(). If they do call php.exit() manually, their program will break. It's a footgun for sure, but I'm okay with responding to that "well just don't call exit() then" :-) If it bites us at one point, we can add a check like if(primary.hasExited()) { throw new Error("The primary PHP instance have exited and this PHPProcessManager can no longer be used. Make sure to avoid calling php.exit(), that's managed for you. "); }

Would it make sense to track parent/child relationships? Maybe that is something best left to the consumer of the PHP Process Manager.

I was thinking about that and here's what I realized:

  • It would be handy to kill the entire child process group, as in Erlang's supervisor trees.
  • We can track that when PHP spawns another one directly, as in proc_open(), but we cannot easily track that when PHP issues a loopback request via HTTP and that request spawns another process – in this case it would technically be a child of the request handler.
  • At the same time, the request handler isn't a WASM module in the same sense as PHP so now we'd have to introduce more OS-like primitives with a generic process interface, a process table or a supervisor tree, potentially implement file descriptors outside of WASM context etc. I'd rather avoid that complexity while we can. Eventually perhaps we could just build a real process manager to WASM, add custom handlers, and get a lot of that for free?
@adamziel
Copy link
Collaborator Author

The unit test failure is an intermittent timeout. I've addressed the feedback and I'll go ahead and merge this one. Thank you so much for the discussion!

@adamziel adamziel added the [Type] New API New API to be used by package users. label Apr 25, 2024
@adamziel adamziel merged commit 2c55e0f into trunk Apr 25, 2024
@adamziel adamziel deleted the php-process-mgr branch April 25, 2024 08:57
adamziel added a commit that referenced this pull request Apr 25, 2024
Adds support for PHP spawning more PHP instances: ```ts const handler = new PHPRequestHandler({	documentRoot: '/',	phpFactory: async ({ isPrimary }) => {	const php = await createPhp(requestHandler);	if (!isPrimary) {	proxyFileSystem(	await requestHandler.getPrimaryPhp(),	php,	requestHandler.documentRoot	);	}	return php;	},	maxPhpInstances: 2, }); const php = await handler.getPrimaryPhp(); php.writeFile('/second-script.php', `<?php echo 'Hi from another PHP instance!';`); php.writeFile(	'/main.php',	`<?php	require "/wordpress/load.php";	// Can send HTTP requests to self	echo file_get_contents( get_site_url() . '/second-script.php' );	// Can execute PHP scripts	echo exec( 'php /second-script.php );	` }) await startLocalServer( handler ); await handler.request({ url: '/main.php', }); ``` Without this PR, we only have a single PHP instance running. If that instance needs to run another PHP script while it's running, it will get error 502 as there's no other PHP instance available to handle that. Closes #1182 Closes #1177 ## Technical implementation This PR draws a clear line between BaesPHP and PHPRequestHandler. Before this PR, BasePHP could be given a `serverOptions` argument and it would create a `new PHPRequestHandler` as a convenience. After this PR, BasePHP longer instantiates `PHPRequestHandler`. Why? Two reasons: * We may have multiple PHP instances but only ever want a single `PHPRequestHandler` to orchestrate them. * `PHPRequestHandler` must know how to start new PHP instances, but BasePHP shouldn't be concerned with that at all. separates PHP leverages two major building blocks: Furthermore, `PHPRequestHandler` now uses [PHPProcessManager](#1301) to spawn new PHP instances on demand. To create a new request handler, you now need to pass a `processManager` instance or, alternatively, a `phpFactory` function. Ideally, `BasePHP` would have no more references to `PHPRequestHandler` after this PR, but unfortunately we cannot fully decouple the two just yet. `BasePHP` exposes an `absoluteUrl`, `documentRoot`, and a `request` method that are used by Blueprints, and Blueprint steps only get a `BasePHP` instance without any extra contextual information. This will be easier to resolve once we switch to [PHP Blueprints](https://github.com/WordPress/blueprints-library/) where we'll never call `php.request()` and we'll also be able to access these contextual information via `$_SERVER` and such. ## Follow up work * #1233 * Create a separate, minimal Emscripten module as a source of the Filesystem. Remove the concept of a "primary" PHP instance. * Decouple BasePHP from the PHPRequestHandler. The dependencies are super intertwined now – BasePHP needs a PHPRequestHandler instance, but PHPRequestHandler acts on BasePHP. PHPProcessManager needs both, but also configuring a spawn handler requires a PHPProcessManager. * `UniversalPHP` and such are interchangeable with both `BasePHP` and `WebPHPEndpoint`. This isn't intuitive, especially now that `WebPHPEndpoint` acts on many PHP instances. Let's make `BasePHP` distinct from `WebPHPEndpoint`. They will share some methods and props, but will also have disjoint ones.
adamziel added a commit that referenced this pull request Apr 25, 2024
…xhausted (#1324) Fixes a bug in the [PHPProcessManager](#1301) where `acquirePHPInstance()` would continue returning `this.nextInstance` even after the semaphore concurrency limit was exhausted. The solution is to clear `this.nextInstance` for the next `acquirePHPInstance()` call. ## Testing instructions Confirm the CI checks pass
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

3 participants