We have an InfluxDB VM whose swap is constantly at 100% usage. Even if we restart the VM, swap usage reaches 100% again within about 20 minutes, while memory usage sits at only about 50%. (The VM has 32 CPU cores and 128 GB of memory.)
Running free -h:
                  total        used        free      shared  buff/cache   available
    Mem:          123Gi        70Gi       567Mi       551Mi        52Gi        59Gi
    Swap:           9Gi         9Gi          0B

This shows that at least 59 GB of memory is still available, yet 100% of the swap is used.
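To confirm which processes actually hold the swapped-out pages, one option is to sum the per-process VmSwap counters from /proc (a minimal sketch; smem, if installed, reports the same thing):

    # List the top swap consumers by summing VmSwap (kB) per process.
    for d in /proc/[0-9]*; do
        awk '/^Name:/ {name=$2} /^VmSwap:/ {print $2, name}' "$d/status" 2>/dev/null
    done | sort -rn | head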
If we run atop, we see that the disk is 100% busy (both the swap and disk lines are red):
    SWP | tot   10.0G |            | free    0.0M | swcac 505.9M
    DSK | nvme2n1 | busy 100% | read 33115 | write 527 | discrd 0 | KiB/r 19 | KiB/w 173 | | KiB/d 0 | MBr/s 63.3 | MBw/s 8.9 | avq 88.19 | avio 0.30 ms

My guess is that this is the constant inflow of data events... (but then why are the reads so high?)
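To tell whether those reads are swap-ins or demand paging of InfluxDB's mmap'ed TSM/TSI files (as opposed to explicit read() I/O), a rough check is to watch major page faults for influxd together with vmstat's si column:

    # Major faults = pages read back from disk on access (swap or memory-mapped files).
    ps -o pid,maj_flt,min_flt,rss,comm -p "$(pgrep -o influxd)"

    # si = data swapped in from disk per second, bi = data read from block devices.
    # High bi with si near 0 points at file-backed reads rather than swap-ins.
    vmstat 1 5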
Memory and I/O pressure from PSI:
    cat /proc/pressure/memory
    some avg10=32.65 avg60=32.74 avg300=31.25 total=35534063966
    full avg10=32.25 avg60=32.34 avg300=30.87 total=35182532561

    cat /proc/pressure/io
    some avg10=84.83 avg60=78.83 avg300=78.96 total=70337558807
    full avg10=84.38 avg60=78.05 avg300=78.08 total=69619870053

Memory pressure doesn't seem high, but I/O pressure is.
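To see whether the stalls track the ingest load over time, the pressure files can simply be sampled in a loop:

    # Sample memory and I/O pressure once per second.
    watch -n1 'grep . /proc/pressure/memory /proc/pressure/io'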
Running iotop, it is clear that the disk activity comes from influxdb:
     4272 be/3 root       0.00 B/s   94.47 K/s  ?unavailable?  [jbd2/nvme2n1p1-8]
    36921 be/2 vcap    1169.95 K/s    0.00 B/s  ?unavailable?  influxd -config /var/vcap/jobs/influxdb/config/influxdb.conf -pidfile /var/vcap/sys/run/influxdb/influxdb.pid
    36927 be/2 vcap     323.37 K/s    0.00 B/s  ?unavailable?  influxd -config /var/vcap/jobs/influxdb/config/influxdb.conf -pidfile /var/vcap/sys/run/influxdb/influxdb.pid
    36928 be/2 vcap    2038.33 K/s    0.00 B/s  ?unavailable?  influxd -config /var/vcap/jobs/influxdb/config/influxdb.conf -pidfile /var/vcap/sys/run/influxdb/influxdb.pid
    36941 be/2 vcap    1936.59 K/s    0.00 B/s  ?unavailable?  influxd -config /var/vcap/jobs/influxdb/config/influxdb.conf -pidfile /var/vcap/sys/run/influxdb/influxdb.pid
    37020 be/2 vcap     385.14 K/s    0.00 B/s  ?unavailable?  influxd -config /var/vcap/jobs/influxdb/config/influxdb.conf -pidfile /var/vcap/sys/run/influxdb/influxdb.pid
    . . . (lots of influxd threads)

sar output:
    sar -d 10 6
    Linux 6.2.0-39-generic (ac2f95dd-14d9-4eed-8e2f-060615e24dce)  03/24/2024  _x86_64_  (32 CPU)

    06:45:57 AM       DEV       tps     rkB/s     wkB/s     dkB/s   areq-sz    aqu-sz     await     %util
    06:46:07 AM   nvme1n1      0.30     12.80      1.60      0.00     48.00      0.00      1.33      0.12
    06:46:07 AM   nvme0n1      0.30      0.00      3.20      0.00     10.67      0.00      1.00      0.12
    06:46:07 AM   nvme2n1   3420.80  67438.40   3687.20      0.00     20.79    106.47     31.13    100.00

    06:46:07 AM       DEV       tps     rkB/s     wkB/s     dkB/s   areq-sz    aqu-sz     await     %util
    06:46:17 AM   nvme1n1      1.00      0.00      9.20      0.00      9.20      0.00      0.90      0.16
    06:46:17 AM   nvme0n1      0.90     16.00      9.60      0.00     28.44      0.00      0.67      0.20
    06:46:17 AM   nvme2n1   3404.80  68434.40   7868.00      0.00     22.41    102.23     30.03    100.00

    06:46:17 AM       DEV       tps     rkB/s     wkB/s     dkB/s   areq-sz    aqu-sz     await     %util
    06:46:27 AM   nvme1n1      9.70     26.40     20.40      0.00      4.82      0.02      1.69      1.24
    06:46:27 AM   nvme0n1      0.30      0.00      4.40      0.00     14.67      0.00      0.67      0.08
    06:46:27 AM   nvme2n1   3215.40  46037.20  12006.40      0.00     18.05     66.12     20.56    100.00
    ^C
    Average:          DEV       tps     rkB/s     wkB/s     dkB/s   areq-sz    aqu-sz     await     %util
    Average:      nvme1n1      3.67     13.07     10.40      0.00      6.40      0.01      1.61      0.51
    Average:      nvme0n1      0.50      5.33      5.73      0.00     22.13      0.00      0.73      0.13
    Average:      nvme2n1   3347.00  60636.67   7853.87      0.00     20.46     91.61     27.37    100.00
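Per the averages, nvme2n1 is handling about 3,350 requests/s at roughly 20 KiB each, i.e. around 60 MB/s of mostly-read traffic. Whether that saturates the volume depends on its provisioned IOPS/throughput; if it is an EBS volume, those limits can be read with the AWS CLI (the volume ID below is a placeholder):

    # Show the provisioned limits of the EBS volume backing nvme2n1 (replace the volume ID).
    aws ec2 describe-volumes \
        --volume-ids vol-0123456789abcdef0 \
        --query 'Volumes[0].{Type:VolumeType,Iops:Iops,ThroughputMBps:Throughput}'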
Running queries in InfluxDB: it seems like this swap issue occurs even when no queries are running?
    > show queries
    qid   query          database   duration   status
    ---   -----          --------   --------   ------
    265   SHOW QUERIES   metrics    53µs       running

vmstat output:
    vmstat 1 5
    procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
     r  b     swpd    free   buff     cache   si   so     bi    bo    in    cs us sy id wa st
     0 32 10485756  541300   8784 108148928   11  140   3563   194    76   217  1  1 58 40  0
     0 32 10485756  638500   8764 108060800    0    0 128216    60  5181  3351  0  1 59 40  0
     1 31 10485756  505964   8780 108189872    0    0 128252   256  5077  3769  0  1 54 45  0
     0 32 10485756  663736   8744 108035424    0    0 128332     0  5047  3327  0  1 50 50  0
     0 32 10485756  536476   8752 108164376    0    0 127776    24  4087  3335  0  0 53 46  0

/proc/meminfo:
    MemTotal:        129202084 kB
    MemFree:            486060 kB
    MemAvailable:     71279440 kB
    Buffers:             24116 kB
    Cached:           59442056 kB
    SwapCached:         489676 kB
    Active:           51318648 kB
    Inactive:         75364416 kB
    Active(anon):     27646572 kB
    Inactive(anon):   28055976 kB
    Active(file):     23672076 kB
    Inactive(file):   47308440 kB
    Unevictable:            24 kB
    Mlocked:                24 kB
    SwapTotal:        10485756 kB
    SwapFree:                4 kB
    Zswap:                   0 kB
    Zswapped:                0 kB
    Dirty:              102236 kB
    Writeback:            6156 kB
    AnonPages:        66728116 kB
    Mapped:           43055064 kB
    Shmem:              127816 kB
    KReclaimable:       855024 kB
    Slab:               971400 kB
    SReclaimable:       855024 kB
    SUnreclaim:         116376 kB
    KernelStack:         10976 kB
    PageTables:         747920 kB
    SecPageTables:           0 kB
    NFS_Unstable:            0 kB
    Bounce:                  0 kB
    WritebackTmp:            0 kB
    CommitLimit:      75086796 kB
    Committed_AS:     95698296 kB
    VmallocTotal:  34359738367 kB
    VmallocUsed:        151392 kB
    VmallocChunk:            0 kB
    Percpu:              17920 kB
    HardwareCorrupted:       0 kB
    AnonHugePages:     7997440 kB
    ShmemHugePages:          0 kB
    ShmemPmdMapped:          0 kB
    FileHugePages:           0 kB
    FilePmdMapped:           0 kB
    HugePages_Total:         0
    HugePages_Free:          0
    HugePages_Rsvd:          0
    HugePages_Surp:          0
    Hugepagesize:         2048 kB
    Hugetlb:                 0 kB
    DirectMap4k:        202656 kB
    DirectMap2M:       6404096 kB
    DirectMap1G:     124780544 kB

I am also adding some excerpts of the pmap -x output:
    Address           Kbytes     RSS   Dirty Mode  Mapping
    0000000000400000   15232    3684       0 r-x-- influxd
    00000000012e0000   31428    6552       0 r---- influxd
    0000000003191000    4668    4380     396 rw--- influxd
    0000000003620000     180      92      92 rw--- [ anon ]
    0000000004436000     132       0       0 rw--- [ anon ]
    000000c000000000   16384    9864    9864 rw--- [ anon ]
    000000c001000000   47104   28172   28172 rw--- [ anon ]
    000000c003e00000    6144    5016    5016 rw--- [ anon ]
    000000c004400000    2048    1616    1616 rw--- [ anon ]
    000000c004600000    2048    1620    1620 rw--- [ anon ]
    ...
    000000c033a00000  155648  120028  120028 rw--- [ anon ]
    000000c03d200000    8192    8192    8192 rw--- [ anon ]
    000000c03da00000  114688   92768   92768 rw--- [ anon ]
    ...
    000000c07d000000  270336  234948  234948 rw--- [ anon ]
    ...
    000000cecc000000  176128  174080  174080 rw--- [ anon ]
    ...
    000000ced8e00000    2048    2048    2048 rw--- [ anon ]
    000000ced9000000  137216  135168  135168 rw--- [ anon ]
    ... (towards the lower addresses) ...
    00007fa61fdef000    2116    2044    2044 rw--- [ anon ]
    00007fa620000000    9664       0       0 r--s- L3-00000023.tsi
    00007fa620a00000   40048       0       0 r--s- L5-00000032.tsi
    00007fa623200000   40212       0       0 r--s- L5-00000032.tsi
    ...
    00007fa6a2c00000    9772       0       0 r--s- L3-00000023.tsi
    00007fa6a3600000 2098160       0       0 r--s- 000024596-000000002.tsm
    00007fa723800000    9920       0       0 r--s- L3-00000023.tsi
    00007fa724200000  615764       0       0 r--s- 000024596-000000005.tsm
    00007fa749c00000 2100756       0       0 r--s- 000024596-000000004.tsm
    00007fa7ca000000    9768       0       0 r--s- L3-00000023.tsi
    ...
    00007fce82403000   28660    5412    5412 rw--- [ anon ]
    00007fce84000000 4194308 2575504       0 r--s- index
    00007fcf84001000       4       0       0 r--s- L0-00000001.tsl
    00007fcf84002000       4       0       0 r--s- L0-00000001.tsl
    00007fcf84003000       4       0       0 r--s- L0-00000001.tsl
    ...
    00007fcfc48f7000    1060       0       0 r--s- L0-00000002.tsl
    00007fcfc4a00000  262144   35444       0 r--s- 0046
    00007fcfd4a00000    2048    1988    1988 rw--- [ anon ]
    00007fcfd4c00000  262144   35948       0 r--s- 0045
    ...
    00007fd055a00000       4       0       0 r--s- L0-00000001.tsl
    00007fd055a01000       4       0       0 r--s- L0-00000001.tsl
    00007fd055a02000       4       0       0 r--s- L0-00000001.tsl
    ...
    00007fd065c0f000     960     924     924 rw--- [ anon ]
    00007fd065cff000    1028       0       0 r--s- L0-00000005.tsl
    00007fd065e00000  262144   31952       0 r--s- 003c
    ...
    00007fda27fee000    8192       8       8 rw--- [ anon ]
    00007fda287ee000       4       0       0 ----- [ anon ]
    00007fda287ef000   43076    1164    1164 rw--- [ anon ]
    00007fda2b200000     160     160       0 r---- libc.so.6
    00007fda2b228000    1620     780       0 r-x-- libc.so.6
    00007fda2b3bd000     352      64       0 r---- libc.so.6
    00007fda2b415000      16       0       0 r---- libc.so.6
    00007fda2b419000       8       0       0 rw--- libc.so.6
    00007fda2b41b000      52       0       0 rw--- [ anon ]
    00007fda2b428000       4       0       0 r--s- L0-00000001.tsl
    00007fda2b429000       4       0       0 r--s- L0-00000001.tsl
    00007fda2b42a000       4       0       0 r--s- L0-00000001.tsl
    00007fda2b42b000       4       0       0 r--s- L0-00000001.tsl
    00007fda2b42c000       4       0       0 r--s- L0-00000001.tsl
    00007fda2b42d000       4       0       0 r--s- L0-00000001.tsl
    00007fda2b42e000     452     452     452 rw--- [ anon ]
    00007fda2b49f000      16       0       0 r--s- L0-00000018.tsl
    00007fda2b4af000     268     112     112 rw--- [ anon ]
    00007fda2b4f2000       4       0       0 r---- libpthread.so.0
    00007fda2b4f3000       4       0       0 r-x-- libpthread.so.0
    00007fda2b4f4000       4       0       0 r---- libpthread.so.0
    00007fda2b4f5000       4       0       0 r---- libpthread.so.0
    00007fda2b4f6000       4       0       0 rw--- libpthread.so.0
    00007fda2b4f7000       4       0       0 r--s- L0-00000001.tsl
    00007fda2b4f8000       8       0       0 r--s- L0-00000001.tsl
    00007fda2b4fa000       4       0       0 r--s- L0-00000001.tsl
    00007fda2b4fb000       8       0       0 rw--- [ anon ]
    00007fda2b4fd000       8       8       0 r---- ld-linux-x86-64.so.2
    00007fda2b4ff000     168     168       0 r-x-- ld-linux-x86-64.so.2
    00007fda2b529000      44      40       0 r---- ld-linux-x86-64.so.2
    00007fda2b534000       4       0       0 r--s- L0-00000001.tsl
    00007fda2b535000       8       0       0 r---- ld-linux-x86-64.so.2
    00007fda2b537000       8       0       0 rw--- ld-linux-x86-64.so.2
    00007fff74913000     132      12      12 rw--- [ stack ]
    00007fff7499b000      16       0       0 r---- [ anon ]
    00007fff7499f000       8       4       0 r-x-- [ anon ]
    ffffffffff600000       4       0       0 --x-- [ anon ]
    ----------------  ------- ------- -------
    total kB        534464172 112696540 74590512

The series cardinality is 252390866. (So, is the VM size inadequate?)
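To get a feel for how much of those memory-mapped TSM/TSI/TSL files is actually resident versus merely mapped, the pmap output can be summed (a rough sketch based on the file-name patterns above):

    # Sum mapped size vs. resident size (kB) for influxd's memory-mapped TSM/TSI/TSL files.
    pmap -x "$(pgrep -o influxd)" \
      | awk '/\.tsm|\.tsi|\.tsl/ {mapped += $2; rss += $3}
             END {printf "mapped: %d kB, resident: %d kB\n", mapped, rss}'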
VM details:
InfluxDB: 1.8.10
CPU count: 32
Memory: 128 GB
Disk: 1 TB (only 50% used)
AWS instance type: m6a.8xlarge (32 vCPU, 128 GB memory); EBS bandwidth is 10 Gbps per https://aws.amazon.com/ec2/instance-types/m6a/
Kernel: Linux 6.2.0-39-generic #40~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 16 10:53:04 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
The swappiness of the VM is 60 (the default). (What does this value actually mean? Initially I thought it was a percentage, but apparently it's an absolute number?)
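For reference, swappiness is read and changed via sysctl; lowering it biases the kernel toward reclaiming page cache instead of swapping out anonymous memory (the value 10 below is only an example, not a recommendation for this workload):

    # Current value
    sysctl vm.swappiness

    # Change at runtime
    sudo sysctl -w vm.swappiness=10

    # Persist across reboots
    echo 'vm.swappiness=10' | sudo tee /etc/sysctl.d/99-swappiness.conf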
How do we debug this disk usage, and how can we tell whether the IOPS have reached their limit? And what is causing so many reads rather than writes?
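For narrowing this down further, these are the kinds of checks that can be run (a sketch; both tools come from the sysstat package):

    # Extended per-device stats for the busy NVMe: r/s vs w/s, request size, queue depth, %util.
    iostat -dxm 5 nvme2n1

    # Per-process I/O split into read/write KB/s, to confirm the reads come from influxd.
    pidstat -d -p "$(pgrep -d, influxd)" 5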
Update: the VM size was increased to 2x the memory.
Observations
meminfo:

    MemFree:           9436328 kB
    MemAvailable:    246346788 kB
    Buffers:            829708 kB
    Cached:          171495864 kB
    SwapCached:         124960 kB
    Active:           78087852 kB
    Inactive:        167324320 kB
    Active(anon):      6396424 kB
    Inactive(anon):    2389588 kB
    Active(file):     71691428 kB
    Inactive(file):  164934732 kB

vmstat:
    vmstat
    procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
     r  b    swpd     free    buff     cache   si   so    bi    bo   in   cs us sy id wa st
     3  0 2379520 10251664  835112 172756112    1    2   196   596    7    4  2  0 93  5  0

Disk usage in atop has reduced significantly, to about 20% busy:
    DSK | nvme2n1 | busy 20% | read 51 | write 2103 | discrd 0 | KiB/r 18 | KiB/w 165 | | KiB/d 0 | MBr/s 0.1 | MBw/s 34.0 | avq 13.95 | avio 0.94 ms