I have a performance issue with mdraid. One 18x10TB software RAID6 array is resyncing at ~70 MB/s:
```
kernel 5.8.13-1.el8

/dev/md0:
           Version : 1.2
     Creation Time : Mon Oct 5 15:11:15 2020
        Raid Level : raid6
        Array Size : 155136221184 (144.48 TiB 158.86 TB)
     Used Dev Size : 9696013824 (9.03 TiB 9.93 TB)
      Raid Devices : 18
     Total Devices : 18
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Wed Aug 18 18:35:42 2021
             State : clean, degraded, resyncing
    Active Devices : 17
   Working Devices : 18
    Failed Devices : 0
     Spare Devices : 1

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : bitmap

     Resync Status : 25% complete

              Name : large2:0  (local to host large2)
              UUID : bdb63778:b3765982:b257478b:70121351
            Events : 500678

    Number   Major   Minor   RaidDevice State
      18       8       5        0      active sync   /dev/sda5
       1       8      21        1      active sync   /dev/sdb5
       2       8      34        2      active sync   /dev/sdc2
       3       8      50        3      active sync   /dev/sdd2
       4       8      66        4      active sync   /dev/sde2
       5       8      82        5      active sync   /dev/sdf2
       -       0       0        6      removed
       7       8     114        7      active sync   /dev/sdh2
       8       8     130        8      active sync   /dev/sdi2
       9       8     146        9      active sync   /dev/sdj2
      10       8     162       10      active sync   /dev/sdk2
      11       8     178       11      active sync   /dev/sdl2
      12       8     194       12      active sync   /dev/sdm2
      13       8     210       13      active sync   /dev/sdn2
      14       8     226       14      active sync   /dev/sdo2
      15       8     242       15      active sync   /dev/sdp2
      16      65       2       16      active sync   /dev/sdq2
      17      65      18       17      active sync   /dev/sdr2

       6       8      98        -      spare   /dev/sdg2

md0 : active raid6 sdg2[6](S) sda5[18] sdr2[17] sdq2[16] sdp2[15] sdo2[14] sdn2[13] sdm2[12] sdl2[11] sdk2[10] sdj2[9] sdi2[8] sdh2[7] sdf2[5] sde2[4] sdd2[3] sdc2[2] sdb5[1]
      155136221184 blocks super 1.2 level 6, 512k chunk, algorithm 2 [18/17] [UUUUUU_UUUUUUUUUUU]
      [=====>...............]  resync = 25.0% (2431109212/9696013824) finish=1562.2min speed=77503K/sec
      bitmap: 9/73 pages [36KB], 65536KB chunk

Device r/s w/s rkB/s wkB/s rrqm/s wrqm/s %rrqm %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
sdg 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdb 322.00 2.80 77282.40 6.60 16471.00 0.80 98.08 22.22 412.53 126.79 133.40 240.01 2.36 2.53 82.10
sdf 339.40 0.00 77758.40 0.00 16454.40 0.00 97.98 0.00 379.81 0.00 128.91 229.11 0.00 2.39 81.28
sdr 325.00 0.00 77414.40 0.00 16465.20 0.00 98.06 0.00 405.64 0.00 131.83 238.20 0.00 2.40 77.90
sdm 329.00 0.00 78477.60 0.00 16465.40 0.00 98.04 0.00 398.25 0.00 131.02 238.53 0.00 2.38 78.36
sdi 328.60 0.00 77084.00 0.00 16460.40 0.00 98.04 0.00 391.20 0.00 128.55 234.58 0.00 2.48 81.64
sdh 335.40 0.00 77753.60 0.00 16456.20 0.00 98.00 0.00 389.88 0.00 130.77 231.82 0.00 2.42 81.14
sdj 326.40 0.00 77700.80 0.00 16464.80 0.00 98.06 0.00 408.07 0.00 133.19 238.05 0.00 2.48 80.90
sde 328.60 0.00 77700.80 0.00 16462.60 0.00 98.04 0.00 398.74 0.00 131.03 236.46 0.00 2.46 80.92
sdn 332.00 0.00 77050.40 0.00 16456.60 0.00 98.02 0.00 382.56 0.00 127.01 232.08 0.00 2.35 78.12
sdl 324.80 0.00 76341.60 0.00 16461.20 0.00 98.07 0.00 385.14 0.00 125.09 235.04 0.00 2.40 78.00
sdp 326.60 0.00 76789.60 0.00 16461.00 0.00 98.05 0.00 393.01 0.00 128.36 235.12 0.00 2.38 77.76
sdq 325.00 0.00 77281.60 0.00 16464.60 0.00 98.06 0.00 404.60 0.00 131.49 237.79 0.00 2.40 77.94
sdk 331.80 0.00 77685.60 0.00 16459.40 0.00 98.02 0.00 386.56 0.00 128.26 234.13 0.00 2.34 77.48
sda 324.20 2.80 77067.20 6.60 16464.40 0.80 98.07 22.22 426.02 135.93 138.74 237.71 2.36 2.59 84.62
sdd 327.60 0.00 77276.00 0.00 16461.80 0.00 98.05 0.00 401.83 0.00 131.64 235.89 0.00 2.47 81.02
sdc 330.80 0.00 77605.60 0.00 16460.00 0.00 98.03 0.00 396.68 0.00 131.22 234.60 0.00 2.46 81.24
sdo 326.40 0.00 77927.20 0.00 16465.60 0.00 98.06 0.00 402.73 0.00 131.45 238.75 0.00 2.40 78.38
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
```
and a second, 36x14TB software RAID6 array that is doing its initial resync at ~40 MB/s:
```
kernel 5.13.11-1.el8

           Version : 1.2
     Creation Time : Tue Aug 17 09:37:39 2021
        Raid Level : raid6
        Array Size : 464838634496 (432.91 TiB 475.99 TB)
     Used Dev Size : 13671724544 (12.73 TiB 14.00 TB)
      Raid Devices : 36
     Total Devices : 36
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Wed Aug 18 16:39:11 2021
             State : active, resyncing
    Active Devices : 36
   Working Devices : 36
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : bitmap

     Resync Status : 32% complete

              Name : large1:0  (local to host large1)
              UUID : b7cace22:832e570f:eba39768:bb1a1ed6
            Events : 20709

    Number   Major   Minor   RaidDevice State
       0       8      33        0      active sync   /dev/sdc1
       1       8      49        1      active sync   /dev/sdd1
       2       8      65        2      active sync   /dev/sde1
       3       8      81        3      active sync   /dev/sdf1
       4       8      97        4      active sync   /dev/sdg1
       5       8     113        5      active sync   /dev/sdh1
       6       8     129        6      active sync   /dev/sdi1
       7       8     145        7      active sync   /dev/sdj1
       8       8     161        8      active sync   /dev/sdk1
       9       8     209        9      active sync   /dev/sdn1
      10       8     177       10      active sync   /dev/sdl1
      11       8     225       11      active sync   /dev/sdo1
      12       8     241       12      active sync   /dev/sdp1
      13      65       1       13      active sync   /dev/sdq1
      14      65      17       14      active sync   /dev/sdr1
      15       8     193       15      active sync   /dev/sdm1
      16      65     145       16      active sync   /dev/sdz1
      17      65     161       17      active sync   /dev/sdaa1
      18      65      33       18      active sync   /dev/sds1
      19      65      49       19      active sync   /dev/sdt1
      20      65      65       20      active sync   /dev/sdu1
      21      65      81       21      active sync   /dev/sdv1
      22      65      97       22      active sync   /dev/sdw1
      23      65     113       23      active sync   /dev/sdx1
      24      65     129       24      active sync   /dev/sdy1
      25      65     177       25      active sync   /dev/sdab1
      26      65     193       26      active sync   /dev/sdac1
      27      65     209       27      active sync   /dev/sdad1
      28      65     225       28      active sync   /dev/sdae1
      29      65     241       29      active sync   /dev/sdaf1
      30      66       1       30      active sync   /dev/sdag1
      31      66      17       31      active sync   /dev/sdah1
      32      66      33       32      active sync   /dev/sdai1
      33      66      49       33      active sync   /dev/sdaj1
      34      66      65       34      active sync   /dev/sdak1
      35      66      81       35      active sync   /dev/sdal1

md0 : active raid6 sdal1[35] sdak1[34] sdaj1[33] sdah1[31] sdai1[32] sdag1[30] sdaf1[29] sdac1[26] sdae1[28] sdab1[25] sdad1[27] sds1[18] sdq1[13] sdz1[16] sdo1[11] sdp1[12] sdx1[23] sdr1[14] sdw1[22] sdn1[9] sdaa1[17] sdv1[21] sdu1[20] sdy1[24] sdt1[19] sdk1[8] sdm1[15] sdl1[10] sdh1[5] sdj1[7] sdf1[3] sdi1[6] sdc1[0] sdg1[4] sde1[2] sdd1[1]
      464838634496 blocks super 1.2 level 6, 512k chunk, algorithm 2 [36/36] [UUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUU]
      [======>..............]  resync = 32.4% (4433869056/13671724544) finish=3954.9min speed=38929K/sec
      bitmap: 70/102 pages [280KB], 65536KB chunk

Device r/s w/s rkB/s wkB/s rrqm/s wrqm/s %rrqm %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
sdc 9738.60 1.40 38956.00 5.80 0.40 0.40 0.00 22.22 0.20 9.29 1.93 4.00 4.14 0.07 71.82
sdd 9738.20 1.00 38952.80 2.60 0.00 0.00 0.00 0.00 0.89 5.80 8.68 4.00 2.60 0.07 71.60
sde 9738.60 1.40 38956.00 5.80 0.40 0.40 0.00 22.22 0.31 3.71 3.02 4.00 4.14 0.07 70.60
sdf 9738.40 1.00 38953.60 2.60 0.00 0.00 0.00 0.00 0.17 3.20 1.69 4.00 2.60 0.07 70.56
sdg 9738.40 1.00 38953.60 2.60 0.00 0.00 0.00 0.00 0.85 4.20 8.31 4.00 2.60 0.07 70.72
sdh 9738.40 1.00 38953.60 2.60 0.00 0.00 0.00 0.00 0.20 4.00 1.93 4.00 2.60 0.07 70.64
sdi 9738.60 1.00 38954.40 2.60 0.00 0.00 0.00 0.00 0.17 8.20 1.70 4.00 2.60 0.07 70.98
sdj 9714.60 1.00 38954.40 2.60 24.00 0.00 0.25 0.00 0.58 4.00 5.61 4.01 2.60 0.07 70.66
sdk 9677.00 1.00 38953.60 2.60 61.40 0.00 0.63 0.00 1.23 4.40 11.94 4.03 2.60 0.07 70.76
sdl 9738.40 1.00 38953.60 2.60 0.00 0.00 0.00 0.00 0.15 5.80 1.44 4.00 2.60 0.07 70.76
sdm 9738.40 1.00 38953.60 2.60 0.00 0.00 0.00 0.00 0.38 2.80 3.73 4.00 2.60 0.07 70.96
sdo 9705.60 1.00 38953.60 2.60 32.80 0.00 0.34 0.00 0.83 5.80 8.07 4.01 2.60 0.07 70.80
sdp 9738.40 1.00 38953.60 2.60 0.00 0.00 0.00 0.00 0.30 4.20 2.91 4.00 2.60 0.07 70.60
sdn 9738.40 1.00 38953.60 2.60 0.00 0.00 0.00 0.00 0.34 5.60 3.30 4.00 2.60 0.07 70.76
sdt 9659.80 1.00 38954.40 2.60 78.80 0.00 0.81 0.00 1.00 4.00 9.71 4.03 2.60 0.07 70.44
sds 9640.40 1.00 38954.40 2.60 98.20 0.00 1.01 0.00 1.29 5.60 12.42 4.04 2.60 0.07 70.60
sdq 9738.40 1.00 38953.60 2.60 0.00 0.00 0.00 0.00 0.30 4.40 2.92 4.00 2.60 0.07 70.68
sdu 9738.60 1.00 38954.40 2.60 0.00 0.00 0.00 0.00 0.13 4.40 1.31 4.00 2.60 0.07 70.66
sdv 9696.20 1.00 38954.40 2.60 42.40 0.00 0.44 0.00 1.30 4.20 12.57 4.02 2.60 0.07 70.76
sdw 9738.40 1.00 38953.60 2.60 0.00 0.00 0.00 0.00 0.94 4.20 9.13 4.00 2.60 0.07 70.70
sdy 9738.40 1.00 38953.60 2.60 0.00 0.00 0.00 0.00 0.11 4.40 1.05 4.00 2.60 0.07 70.62
sdr 9730.80 1.00 38953.60 2.60 7.60 0.00 0.08 0.00 1.22 4.20 11.87 4.00 2.60 0.07 70.68
sdx 9718.00 1.00 38954.40 2.60 20.60 0.00 0.21 0.00 0.88 4.20 8.57 4.01 2.60 0.07 70.70
sdaa 9738.40 1.00 38953.60 2.60 0.00 0.00 0.00 0.00 0.24 4.20 2.38 4.00 2.60 0.07 70.60
sdz 9738.40 1.00 38953.60 2.60 0.00 0.00 0.00 0.00 0.20 4.20 1.91 4.00 2.60 0.07 70.60
sdab 9633.60 1.00 38953.60 2.60 104.80 0.00 1.08 0.00 1.38 4.20 13.33 4.04 2.60 0.07 70.52
sdac 9639.20 1.00 38954.40 2.60 99.40 0.00 1.02 0.00 1.08 5.60 10.45 4.04 2.60 0.07 70.56
sdad 9536.20 1.00 38954.40 2.60 202.40 0.00 2.08 0.00 2.73 4.00 26.04 4.08 2.60 0.07 70.36
sdaf 9738.60 1.00 38954.40 2.60 0.00 0.00 0.00 0.00 0.37 4.00 3.63 4.00 2.60 0.07 70.64
sdae 9738.60 1.00 38954.40 2.60 0.00 0.00 0.00 0.00 0.16 5.40 1.61 4.00 2.60 0.07 70.72
sdag 9735.20 1.00 38940.80 2.60 0.00 0.00 0.00 0.00 0.46 5.80 4.48 4.00 2.60 0.07 70.76
sdai 9738.60 1.00 38954.40 2.60 0.00 0.00 0.00 0.00 0.31 4.00 3.01 4.00 2.60 0.07 70.60
sdah 9661.60 1.00 38955.20 2.60 77.00 0.00 0.79 0.00 1.51 4.20 14.57 4.03 2.60 0.07 70.70
sdal 9739.20 1.40 38958.40 5.80 0.40 0.40 0.00 22.22 0.27 4.86 2.65 4.00 4.14 0.07 70.80
sdaj 9738.60 1.00 38954.40 2.60 0.00 0.00 0.00 0.00 0.17 4.40 1.68 4.00 2.60 0.07 70.64
sdak 9738.80 1.00 38955.20 2.60 0.00 0.00 0.00 0.00 0.53 5.40 5.21 4.00 2.60 0.07 70.80
```

Both arrays are running on systems with 32+ cores and 64+ GB of RAM, with no other load.
Both arrays have stripe_cache_size = 32768.
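For reference, this was set through the usual per-array sysfs knob (a sketch; adjust the md device name as needed):

```
# number of stripe cache entries; memory use is roughly
# PAGE_SIZE * nr_disks * stripe_cache_size
echo 32768 > /sys/block/md0/md/stripe_cache_size
cat /sys/block/md0/md/stripe_cache_size   # verify
```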
The md0_raid6 process is using 50-75% CPU on both servers.
Each drive in both arrays delivers >100 MB/s sequential read when tested with fio.
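The exact fio job isn't reproduced here; the per-drive test was something along these lines (device name illustrative):

```
# raw sequential read from one member disk; --readonly guards against writes
fio --name=seqread --filename=/dev/sdc --readonly --direct=1 \
    --rw=read --bs=1M --iodepth=8 --ioengine=libaio \
    --runtime=30 --time_based
```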
The 10TB HDDs in the 18-drive array are TOSHIBA MG06ACA10TE:
```
# blockdev --report
RO    RA   SSZ   BSZ   StartSec            Size   Device
rw  8192   512  4096          0  10000831348736   /dev/sdg
rw  8192   512  4096       2048         1048576   /dev/sdg1
rw  8192   512   512       4096  10000829234688   /dev/sdg2
rw  8192   512  4096          0  10000831348736   /dev/sdb
rw  8192   512  4096       2048         1048576   /dev/sdb1
rw  8192   512  4096       4096     17179869184   /dev/sdb2
rw  8192   512  4096   33558528      1074790400   /dev/sdb3
rw  8192   512  4096   35657728     53720645632   /dev/sdb4
rw  8192   512   512  140580864   9928853929472   /dev/sdb5
rw  8192   512  4096          0  10000831348736   /dev/sdf
rw  8192   512  4096       2048         1048576   /dev/sdf1
rw  8192   512   512       4096  10000829234688   /dev/sdf2
rw  8192   512  4096          0  10000831348736   /dev/sdr
rw  8192   512  4096       2048         1048576   /dev/sdr1
rw  8192   512   512       4096  10000829234688   /dev/sdr2
rw  8192   512  4096          0  10000831348736   /dev/sdm
rw  8192   512  4096       2048         1048576   /dev/sdm1
rw  8192   512   512       4096  10000829234688   /dev/sdm2
rw  8192   512  4096          0  10000831348736   /dev/sdi
rw  8192   512  4096       2048         1048576   /dev/sdi1
rw  8192   512   512       4096  10000829234688   /dev/sdi2
rw  8192   512  4096          0  10000831348736   /dev/sdh
rw  8192   512  4096       2048         1048576   /dev/sdh1
rw  8192   512   512       4096  10000829234688   /dev/sdh2
rw  8192   512  4096          0  10000831348736   /dev/sdj
rw  8192   512  4096       2048         1048576   /dev/sdj1
rw  8192   512   512       4096  10000829234688   /dev/sdj2
rw  8192   512  4096          0  10000831348736   /dev/sde
rw  8192   512  4096       2048         1048576   /dev/sde1
rw  8192   512   512       4096  10000829234688   /dev/sde2
rw  8192   512  4096          0  10000831348736   /dev/sdn
rw  8192   512  4096       2048         1048576   /dev/sdn1
rw  8192   512   512       4096  10000829234688   /dev/sdn2
rw  8192   512  4096          0  10000831348736   /dev/sdl
rw  8192   512  4096       2048         1048576   /dev/sdl1
rw  8192   512   512       4096  10000829234688   /dev/sdl2
rw  8192   512  4096          0  10000831348736   /dev/sdp
rw  8192   512  4096       2048         1048576   /dev/sdp1
rw  8192   512   512       4096  10000829234688   /dev/sdp2
rw  8192   512  4096          0  10000831348736   /dev/sdq
rw  8192   512  4096       2048         1048576   /dev/sdq1
rw  8192   512   512       4096  10000829234688   /dev/sdq2
rw  8192   512  4096          0  10000831348736   /dev/sdk
rw  8192   512  4096       2048         1048576   /dev/sdk1
rw  8192   512   512       4096  10000829234688   /dev/sdk2
rw  8192   512  4096          0  10000831348736   /dev/sda
rw  8192   512  4096       2048         1048576   /dev/sda1
rw  8192   512  4096       4096     17179869184   /dev/sda2
rw  8192   512  4096   33558528      1074790400   /dev/sda3
rw  8192   512  4096   35657728     53720645632   /dev/sda4
rw  8192   512   512  140580864   9928853929472   /dev/sda5
rw  8192   512  4096          0  10000831348736   /dev/sdd
rw  8192   512  4096       2048         1048576   /dev/sdd1
rw  8192   512   512       4096  10000829234688   /dev/sdd2
rw  8192   512  4096          0  10000831348736   /dev/sdc
rw  8192   512  4096       2048         1048576   /dev/sdc1
rw  8192   512   512       4096  10000829234688   /dev/sdc2
rw  8192   512  4096          0  10000831348736   /dev/sdo
rw  8192   512  4096       2048         1048576   /dev/sdo1
rw  8192   512   512       4096  10000829234688   /dev/sdo2
rw  8192   512  4096          0      1072693248   /dev/md127
rw  8192   512  4096          0     53686042624   /dev/md126
rw 32768   512  4096          0 158859490492416   /dev/md0
```

The 14TB HDDs in the 36-drive array are WDC WUH721414AL5201:
```
# blockdev --report
RO    RA   SSZ   BSZ   StartSec            Size   Device
rw  8192   512  4096          0    480103981056   /dev/sda
rw  8192   512  4096       2048       535822336   /dev/sda1
rw  8192   512  4096    1048576       536870912   /dev/sda2
rw  8192   512  4096    2097152    447569985536   /dev/sda3
rw  8192   512  4096  876257280     31457280000   /dev/sda4
rw  8192   512  4096          0    480103981056   /dev/sdb
rw  8192   512   512       2048       535822336   /dev/sdb1
rw  8192   512  4096    1048576       536870912   /dev/sdb2
rw  8192   512  4096    2097152    447569985536   /dev/sdb3
rw  8192   512  4096  876257280     31457280000   /dev/sdb4
rw  8192   512   512  937698992         2080256   /dev/sdb5
rw  8192   512  4096          0  14000519643136   /dev/sdc
rw  8192   512   512       2048  13999981706752   /dev/sdc1
rw  8192   512  4096          0  14000519643136   /dev/sdd
rw  8192   512   512       2048  13999981706752   /dev/sdd1
rw  8192   512  4096          0  14000519643136   /dev/sde
rw  8192   512   512       2048  13999981706752   /dev/sde1
rw  8192   512  4096          0  14000519643136   /dev/sdf
rw  8192   512   512       2048  13999981706752   /dev/sdf1
rw  8192   512  4096          0  14000519643136   /dev/sdg
rw  8192   512   512       2048  13999981706752   /dev/sdg1
rw  8192   512  4096          0  14000519643136   /dev/sdh
rw  8192   512   512       2048  13999981706752   /dev/sdh1
rw  8192   512  4096          0  14000519643136   /dev/sdi
rw  8192   512   512       2048  13999981706752   /dev/sdi1
rw  8192   512  4096          0  14000519643136   /dev/sdj
rw  8192   512   512       2048  13999981706752   /dev/sdj1
rw  8192   512  4096          0       536281088   /dev/md2
rw  8192   512  4096          0  14000519643136   /dev/sdk
rw  8192   512   512       2048  13999981706752   /dev/sdk1
rw  8192   512  4096          0  14000519643136   /dev/sdl
rw  8192   512   512       2048  13999981706752   /dev/sdl1
rw  8192   512  4096          0    447435767808   /dev/md3
rw  8192   512  4096          0  14000519643136   /dev/sdm
rw  8192   512   512       2048  13999981706752   /dev/sdm1
rw 69632   512  4096          0 475994761723904   /dev/md0
rw  8192   512  4096          0  14000519643136   /dev/sdo
rw  8192   512   512       2048  13999981706752   /dev/sdo1
rw  8192   512  4096          0  14000519643136   /dev/sdp
rw  8192   512   512       2048  13999981706752   /dev/sdp1
rw  8192   512  4096          0  14000519643136   /dev/sdn
rw  8192   512   512       2048  13999981706752   /dev/sdn1
rw  8192   512  4096          0  14000519643136   /dev/sdt
rw  8192   512   512       2048  13999981706752   /dev/sdt1
rw  8192   512  4096          0  14000519643136   /dev/sds
rw  8192   512   512       2048  13999981706752   /dev/sds1
rw  8192   512  4096          0  14000519643136   /dev/sdq
rw  8192   512   512       2048  13999981706752   /dev/sdq1
rw  8192   512  4096          0  14000519643136   /dev/sdu
rw  8192   512   512       2048  13999981706752   /dev/sdu1
rw  8192   512  4096          0  14000519643136   /dev/sdv
rw  8192   512   512       2048  13999981706752   /dev/sdv1
rw  8192   512  4096          0  14000519643136   /dev/sdw
rw  8192   512   512       2048  13999981706752   /dev/sdw1
rw  8192   512  4096          0  14000519643136   /dev/sdy
rw  8192   512   512       2048  13999981706752   /dev/sdy1
rw  8192   512  4096          0  14000519643136   /dev/sdr
rw  8192   512   512       2048  13999981706752   /dev/sdr1
rw  8192   512  4096          0  14000519643136   /dev/sdx
rw  8192   512   512       2048  13999981706752   /dev/sdx1
rw  8192   512  4096          0  14000519643136   /dev/sdaa
rw  8192   512   512       2048  13999981706752   /dev/sdaa1
rw  8192   512  4096          0  14000519643136   /dev/sdz
rw  8192   512   512       2048  13999981706752   /dev/sdz1
rw  8192   512  4096          0  14000519643136   /dev/sdab
rw  8192   512   512       2048  13999981706752   /dev/sdab1
rw  8192   512  4096          0  14000519643136   /dev/sdac
rw  8192   512   512       2048  13999981706752   /dev/sdac1
rw  8192   512  4096          0  14000519643136   /dev/sdad
rw  8192   512   512       2048  13999981706752   /dev/sdad1
rw  8192   512  4096          0  14000519643136   /dev/sdaf
rw  8192   512   512       2048  13999981706752   /dev/sdaf1
rw  8192   512  4096          0  14000519643136   /dev/sdae
rw  8192   512   512       2048  13999981706752   /dev/sdae1
rw  8192   512  4096          0  14000519643136   /dev/sdag
rw  8192   512   512       2048  13999981706752   /dev/sdag1
rw  8192   512  4096          0  14000519643136   /dev/sdai
rw  8192   512   512       2048  13999981706752   /dev/sdai1
rw  8192   512  4096          0  14000519643136   /dev/sdah
rw  8192   512   512       2048  13999981706752   /dev/sdah1
rw  8192   512  4096          0  14000519643136   /dev/sdal
rw  8192   512   512       2048  13999981706752   /dev/sdal1
rw  8192   512  4096          0  14000519643136   /dev/sdaj
rw  8192   512   512       2048  13999981706752   /dev/sdaj1
rw  8192   512  4096          0  14000519643136   /dev/sdak
rw  8192   512   512       2048  13999981706752   /dev/sdak1
```

On both arrays, sync_speed_min/sync_speed_max is set to 200000.
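These are the per-array sysfs knobs (values in KB/s), set along these lines:

```
# raise the per-array resync floor and ceiling to ~200 MB/s
echo 200000 > /sys/block/md0/md/sync_speed_min
echo 200000 > /sys/block/md0/md/sync_speed_max
```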
The 18-drive array is connected as a JBOD via an LSI SAS3008 PCI-Express Fusion-MPT SAS-3 controller.
The 36-drive array is connected as a JBOD via two LSI SAS3008 PCI-Express Fusion-MPT SAS-3 controllers.
All controllers are in PCI-E 3.0 x8 slots: LnkSta: Speed 8GT/s (ok), Width x8 (ok)
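The LnkSta line above comes from lspci, e.g. (bus address illustrative):

```
# confirm the negotiated PCIe link speed/width for the HBA
lspci -vv -s 01:00.0 | grep -E 'LnkCap|LnkSta'
```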
My questions are:
- Why does the array with 36 drives have an almost two times slower resync rate?
- iostat for the 18-drive array shows 322.00 r/s, 77282.40 rkB/s, 16454.40 rrqm/s, but iostat for the 36-drive array shows 9738.60 r/s, 38956.00 rkB/s, 0 rrqm/s. Why is the second array not merging its I/O?
- Is there anything I can try to speed up the resync on the second array?
UPDATE
I was able to speed up the 18-drive array from 70 MB/s to 180 MB/s by increasing the number of mdraid threads:
```
echo 8 > /sys/block/md0/md/group_thread_cnt
```

What is even more interesting: doing the same on the 36-drive array decreased performance from 40 MB/s to 30 MB/s.
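For anyone repeating the experiment, a rough sweep like this (a sketch, run as root) makes the thread-count effect easy to compare:

```
for n in 1 2 4 8 16; do
    echo "$n" > /sys/block/md0/md/group_thread_cnt
    sleep 60                                   # let the resync rate settle
    printf 'threads=%s ' "$n"
    grep -oE 'speed=[0-9]+K/sec' /proc/mdstat  # current resync speed
done
```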
UPDATE 2
I just noticed that rareq-sz from iostat on the 36-drive array is only 4KB. It looks like all I/O sent to the disks is always only 4KB. This is really strange. Why is md doing the resync in 4KB chunks on this array?
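To confirm the 4KB figure is real and not an iostat averaging artifact, the request sizes hitting a member disk can be watched directly with blktrace (sdc as an example member; requests show up as "sector + nr_sectors", where 8 sectors = 4 KiB):

```
# trace a few seconds of I/O against one member disk
timeout 5 blktrace -d /dev/sdc -o - | blkparse -i -
```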
UPDATE 3
I have done a bit more research on a server with 24 NVMe drives and found that the resync speed bottleneck affects RAID6 arrays with more than 16 drives:
```
# mdadm --create --verbose /dev/md0 --level=6 --raid-devices=16 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1 /dev/nvme5n1 /dev/nvme6n1 /dev/nvme7n1 /dev/nvme8n1 /dev/nvme9n1 /dev/nvme10n1 /dev/nvme11n1 /dev/nvme12n1 /dev/nvme13n1 /dev/nvme14n1 /dev/nvme15n1 /dev/nvme16n1
# iostat -dx 5
Device r/s w/s rkB/s wkB/s rrqm/s wrqm/s %rrqm %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
nvme0n1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
nvme1n1 342.60 0.40 161311.20 0.90 39996.60 0.00 99.15 0.00 2.88 0.00 0.99 470.84 2.25 2.51 86.04
nvme4n1 342.60 0.40 161311.20 0.90 39996.60 0.00 99.15 0.00 2.89 0.00 0.99 470.84 2.25 2.51 86.06
nvme5n1 342.60 0.40 161311.20 0.90 39996.60 0.00 99.15 0.00 2.89 0.00 0.99 470.84 2.25 2.51 86.14
nvme10n1 342.60 0.40 161311.20 0.90 39996.60 0.00 99.15 0.00 2.90 0.00 0.99 470.84 2.25 2.51 86.20
```

As you can see, there are 342 IOPS with a rareq-sz of ~470KB. But when I create a RAID6 with 17 or more drives:
```
# mdadm --create --verbose /dev/md0 --level=6 --raid-devices=17 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1 /dev/nvme5n1 /dev/nvme6n1 /dev/nvme7n1 /dev/nvme8n1 /dev/nvme9n1 /dev/nvme10n1 /dev/nvme11n1 /dev/nvme12n1 /dev/nvme13n1 /dev/nvme14n1 /dev/nvme15n1 /dev/nvme16n1 /dev/nvme17n1
# iostat -dx 5
Device r/s w/s rkB/s wkB/s rrqm/s wrqm/s %rrqm %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
nvme0n1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
nvme1n1 21484.20 0.40 85936.80 0.90 0.00 0.00 0.00 0.00 0.04 0.00 0.82 4.00 2.25 0.05 99.16
nvme4n1 21484.00 0.40 85936.00 0.90 0.00 0.00 0.00 0.00 0.03 0.00 0.74 4.00 2.25 0.05 99.16
nvme5n1 21484.00 0.40 85936.00 0.90 0.00 0.00 0.00 0.00 0.04 0.00 0.84 4.00 2.25 0.05 99.16
```

rareq-sz drops to 4KB, IOPS increase to ~21,484, and the resync speed drops to 85 MB/s.
Why is it like that?
Could someone let me know which part of the mdraid kernel code is responsible for this limitation?
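For what it's worth, the RAID5/6 resync path appears to live in drivers/md/raid5.c: raid5_sync_request() processes stripes one page (4 KiB) per member device at a time and leaves merging of the resulting reads to the block layer. This is my reading of the source, not a confirmed answer; a starting point for digging in a kernel tree:

```
# entry point for the RAID5/6 resync, and the per-page stripe unit it works in
grep -n 'raid5_sync_request' drivers/md/raid5.c
grep -n 'STRIPE_SIZE' drivers/md/raid5.h
```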
(From the comments: check sysctl dev.raid.speed_limit_min and sysctl dev.raid.speed_limit_max; and, ONLY for the rebuild, mdadm --grow --bitmap=internal /dev/md0, and to revert, mdadm --grow --bitmap=none /dev/md0.)