I'm having issues with disk performance on a three-node Windows Server 2019 failover cluster using Storage Spaces Direct.
Each node has 4 x 3.2 TB SSDs and 16 x 2.4 TB HDDs, the cluster storage is configured for three-way mirroring, and the nodes are connected by 25 Gbps networking.
Running diskspd in a virtual machine on the cluster shows write speeds as low as 70 MiB/s. The same test in a similar virtual machine running under QEMU/KVM with spinning disks gets over 500 MiB/s.
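For context, the test was a write run along these lines (illustrative parameters, not the exact command; the test file path is an assumed CSV location):

```powershell
# Illustrative diskspd run (assumed parameters, not the exact command used):
# 64 KiB writes for 60 s, 4 threads, 8 outstanding I/Os per thread,
# 100% writes (-w100), software and hardware caching disabled (-Sh),
# against a 10 GiB test file created on a cluster shared volume.
.\diskspd.exe -b64K -d60 -t4 -o8 -w100 -Sh -c10G C:\ClusterStorage\Volume1\test.dat
```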
As far as I can see, all the SSDs have been auto-configured as Journal disks and are healthy, as I'd expect:
Get-PhysicalDisk -Model 'MO003200KXPTT' | Select-Object FriendlyName,DeviceID,HealthStatus,OperationalStatus,AdapterSerialNumber,BusType,Size,CannotPoolReason,CanPool,Usage | ft

FriendlyName       DeviceID HealthStatus OperationalStatus AdapterSerialNumber BusType          Size CannotPoolReason CanPool Usage
------------       -------- ------------ ----------------- ------------------- -------          ---- ---------------- ------- -----
NVMe MO003200KXPTT 3003     Healthy      OK                                    SAS     3200631791616 In a Pool        False   Journal
NVMe MO003200KXPTT 1003     Healthy      OK                                    SAS     3200631791616 In a Pool        False   Journal
NVMe MO003200KXPTT 3001     Healthy      OK                                    SAS     3200631791616 In a Pool        False   Journal
NVMe MO003200KXPTT 2001     Healthy      OK                                    SAS     3200631791616 In a Pool        False   Journal
NVMe MO003200KXPTT 3002     Healthy      OK                                    SAS     3200631791616 In a Pool        False   Journal
NVMe MO003200KXPTT 2004     Healthy      OK                                    SAS     3200631791616 In a Pool        False   Journal
NVMe MO003200KXPTT 2002     Healthy      OK                                    SAS     3200631791616 In a Pool        False   Journal
NVMe MO003200KXPTT 1002     Healthy      OK                                    SAS     3200631791616 In a Pool        False   Journal
NVMe MO003200KXPTT 3004     Healthy      OK                                    SAS     3200631791616 In a Pool        False   Journal
NVMe MO003200KXPTT 1004     Healthy      OK                                    SAS     3200631791616 In a Pool        False   Journal
NVMe MO003200KXPTT 2003     Healthy      OK                                    SAS     3200631791616 In a Pool        False   Journal
NVMe MO003200KXPTT 1001     Healthy      OK                                    SAS     3200631791616 In a Pool        False   Journal

However, looking in Performance Monitor, I can see that node 1 doesn't appear to be using any of the cache, which I suspect is what's causing the slow performance.
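In case it's relevant, this is how I've been checking the cluster-wide cache configuration and the per-disk usage breakdown (a sketch; I'm assuming the default S2D pool name, so the wildcard may need adjusting):

```powershell
# Cluster-wide S2D cache settings; CacheState should be Enabled and
# CacheModeHDD should be ReadWrite for a hybrid (SSD + HDD) setup.
Get-ClusterStorageSpacesDirect |
    Select-Object CacheState, CacheModeHDD, CacheModeSSD, CachePageSizeKBytes

# Per-disk view: count of cache (Journal) devices vs. capacity devices.
# Assumes the default pool name "S2D on <clustername>".
Get-StoragePool S2D* | Get-PhysicalDisk |
    Group-Object Usage | Select-Object Name, Count
```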
Node 1 Cluster Storage Hybrid Disks
Node 2 Cluster Storage Hybrid Disks (node 3 similar)
The Performance Monitor object Cluster Storage Cache Stores on node 1 has 4 instances, which tally with the 4 SSDs, but they all show 0 active and 0 enabled bindings. Nodes 2 and 3 do show bindings, so node 1 doesn't seem right.
Node 1 Cluster Storage Cache Stores
Node 2 Cluster Storage Cache Stores (node 3 similar)
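For completeness, I've been pulling those counters on each node with something like this (counter set name as it appears in Performance Monitor; the exact counter names may vary by build, so I list them first):

```powershell
# List the counters available in the cache-store set on this node.
(Get-Counter -ListSet 'Cluster Storage Cache Stores').Counter

# Sample the full set once per cache store instance; on node 1 the
# binding-related counters read 0, while nodes 2 and 3 show bindings.
Get-Counter '\Cluster Storage Cache Stores(*)\*' |
    Select-Object -ExpandProperty CounterSamples |
    Select-Object InstanceName, Path, CookedValue
```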
Any suggestions on how to get S2D to actually use the cache disks on node 1?