So normally you can get PHP-FPM's status by querying the /fpmstatus endpoint (or whatever path is configured). This displays information about FPM's state, its worker processes, and so on. FPM allows a maximum number of child processes (48 in our case). Under normal circumstances requests are processed in a fraction of a second, so usually no more than about a third of the children are Running at any given time; the rest are Idle. This means there's always at least one child available to handle the status check.
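
For context, the relevant pool settings look roughly like this (the values are ours; the file path is just an example and varies by distro):

    ; pool config, e.g. /etc/php-fpm.d/www.conf
    pm = dynamic
    pm.max_children = 48          ; hard cap on concurrent worker children
    pm.status_path = /fpmstatus   ; the status endpoint mentioned above

The detail that matters for what follows is that a request for pm.status_path is dispatched to the pool like any other FastCGI request, so it's served by one of those same 48 children.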
However, if we experience a sudden influx of traffic, each FPM instance can find all 48 of its children occupied indefinitely: so many requests are coming in that the instant one completes, another takes its place.
Here's the problem: when this happens, it becomes impossible to ask FPM for its status, because the status check itself needs a free child in order to run. You'd think we could just scale up our pods (yes, we're on Kubernetes), but we use FPM's process utilization as the signal that triggers scaling up... and if we can't read the utilization, we can't automatically detect that it's too high.
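
To make the circularity concrete, here's a sketch of the kind of probe that feeds the scaling decision (the URL and the jq invocation are illustrative, not our actual tooling; the status page does accept a ?json flag):

    # read "active processes" from the status page, compute utilization
    active=$(curl -fsS --max-time 2 'http://127.0.0.1/fpmstatus?json' \
                 | jq '."active processes"')
    echo "utilization: $((100 * active / 48))%"

Under saturation the curl itself queues behind real traffic and times out, so the probe returns nothing exactly when its answer matters most.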
Generally speaking, a monitoring check should not depend on the very mechanism it's monitoring, because when that mechanism becomes unavailable, the monitoring fails along with it. FPM's design does not account for this; there appears to be no out-of-band way to get FPM's status.
I realize we could use proxies like examining the process table, but the mere presence of 48 child processes doesn't tell you whether they're actually busy (see the example below). Hopefully there's some simple answer to this I've missed, but otherwise I'm going to have to look into getting out-of-band status requests added to FPM.
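
For instance, the kernel's view of the pool looks much the same whether the children are idle or busy; a worker blocked on a backend call sleeps in state S just like one waiting for its next request (output below is illustrative):

    $ ps -C php-fpm --no-headers -o stat,args | sort | uniq -c
          1 Ss   php-fpm: master process (/etc/php-fpm.conf)
         48 S    php-fpm: pool www

So counting processes tells us the pool exists, not how hard it's working.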