I'm setting up a new PG13 cluster, on ubuntu 20.04, virtual machine. The performance I'm seeing, via pgbench, is neither good nor bad. What I'm trying to work out is why it is what it is, and not higher, by trying to find where it's being capped.
By running pgbench I had assumed that I would see an obvious bottleneck:
- in CPU. And yet htop shows none of the cores totally maxed. Load is around or less than 2 on a 4 core machine.
- in disk. Running other benchmarks, I can see the disks can run way faster than what's happening under pgbench. The numbers are reasonable. Doubling the disk speed (iSCSI) makes no difference.
- in network. But iftop shows all the numbers are reasonable. Going direct via local socket instead of TCP does speed things up (about double tps). But the bottleneck at that point is still not cpu or disk.
Above shows an experiment going via 127.0.0.1 (to ensure network stack is involved). Iotop (if I understand it correctly) is showing that <1% of the process's time is blocked on IO.
Default pgbench params, but 30 clients:
pgbench -h 127.0.0.1 -U myuser mydb -T 50 -c 30 But they're all blocked - waiting - during the update statement. Is there something obvious I'm missing here? What other tools can I run to understand the issue better?
EDIT to add later pgbench output
nik@pgdb2:~$ sudo -u postgres pgbench -s 100 -h pgdb1 -p 5433 -c 30 -T 5 -j 2 Password: scale option ignored, using count from pgbench_branches table (100) starting vacuum...end. transaction type: <builtin: TPC-B (sort of)> scaling factor: 100 query mode: simple number of clients: 30 number of threads: 2 duration: 5 s number of transactions actually processed: 9814 latency average = 15.372 ms tps = 1951.612278 (including connections establishing) tps = 1955.965884 (excluding connections establishing) 
