
I found that when the system runs out of memory, disk IO usage becomes very high.

It seems that many processes are frantically reading from the hard disk (see the htop output below). When I kill a process that uses too much memory, freeing some memory for the system, the IO usage drops back to a normal level.

The problem can be reproduced by writing a program that consumes memory until there is not enough left on the machine; for example, something like the sketch below. When you kill that running program, everything goes back to normal.
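A minimal memory hog of this kind could be sketched in C as follows (a rough sketch, not my actual memtest program; any program that pins enough physical memory should work):

    /* Hypothetical reproducer: allocate and touch memory in 100 MiB steps
       until malloc fails or the OOM killer intervenes. While it runs,
       watch the "bi" column in vmstat climb as the page cache is
       squeezed out. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #define CHUNK (100UL * 1024 * 1024)   /* 100 MiB per step */

    int main(void) {
        size_t total = 0;
        for (;;) {
            char *p = malloc(CHUNK);
            if (p == NULL)
                break;                 /* address space / memory exhausted */
            memset(p, 1, CHUNK);       /* touch every page so it is really resident */
            total += CHUNK;
            printf("allocated %zu MiB\n", total >> 20);
            sleep(1);                  /* slow enough to watch free / vmstat */
        }
        pause();                       /* hold the memory until killed */
        return 0;
    }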

I am aware of the OS swap mechanism. But it seems swap was not used at all during this time (see the free and vmstat output below).

❯ free -h
              total        used        free      shared  buff/cache   available
Mem:          859Mi       692Mi        60Mi        25Mi       106Mi        36Mi
Swap:            0B          0B          0B
❯ htop
  PID    RES    SHR  CPU%  MEM%  TIME+     DISK READ   DISK WRITE  DISK R/W    Command
 6386  37316   5380   0.7   4.2  10:40.07   14.96 M/s    0.00 B/s   14.96 M/s  ahdbserver-1.3.2-SNAPSH
23252  17880  15748   0.0   2.0   0:01.24    7.91 M/s    0.00 B/s    7.91 M/s  postgres -D /var/lib/po
29428    400      0   0.0   0.0   0:02.63    3.36 M/s    2.63 K/s    3.36 M/s  sgagent -d
 2369   197M      0   0.0  23.0   0:01.00    1.86 M/s    0.00 B/s    1.86 M/s  java -jar memtest-1.0-S
24596  10820      0   0.0   1.2   0:59.53  694.74 K/s    0.00 B/s  694.74 K/s  frps -c frps.ini
22901   122M      0   2.0  14.2   1:15.23  644.74 K/s    0.00 B/s  644.74 K/s  srcds_linux -game dod -
 8735   2016     52   0.7   0.2   1:46.21  344.74 K/s    0.00 B/s  344.74 K/s  htop
 2959   4664    176   0.0   0.5  15:35.06  318.42 K/s    0.00 B/s  318.42 K/s  tmux
23265  18160  14344   0.0   2.1   0:01.30  286.84 K/s    0.00 B/s  286.84 K/s  postgres: 11/main: post
23264   7036   3992   0.0   0.8   0:00.03   78.95 K/s    0.00 B/s   78.95 K/s  postgres: 11/main: Time
23262   7160   4116   0.0   0.8   0:00.04   71.05 K/s    0.00 B/s   71.05 K/s
❯ vmstat 2
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0  68588   2096 103156    0    0 28436     0 1787 4288  2  8 79 12  0
 0  1      0  57564    920 115364    0    0 24604    86 1676 3811  2  4 81 14  0
 0  0      0  70252   1156 102360    0    0 31750     0 1794 4337  3  8 75 15  0
 1  0      0  68632   2776 101380    0    0 38570    16 2139 4879  2 11 67 19  0
 0  0      0  67656    892 104940    0    0 29356    14 1706 3936  3  5 77 15  0
 0  0      0  68596    372 103368    0    0 50684     0 2324 5078  3 11 70 16  0
 0  0      0  69596    268 102512    0    0 35688    38 1890 4282  2  8 76 15  0
 0  1      0  69368    172 102540    0    0 35726    54 1877 4458  2  9 71 19  0
 0  1      0  69684   1912 100916    0    0 28724     0 1759 4235  3  7 74 16  0
 0  0      0  74380    768  97076    0    0 21198     0 1484 3762  2  5 80 13  0
❯ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 10 (buster)
Release:        10
Codename:       buster

What could be causing this?

Thanks.

  • Try running perf top. Commented Sep 10, 2020 at 16:50
  • I've been chasing similar issues for a while now. At first I assumed it was thrashing, but there is no paging activity, only huge block-input counts. I find it hard to believe it's insufficient cache, as the system still has over 2 GiB of cache, yet suddenly many processes are busy doing disk reads (that previously weren't). perf top shows over 10% of time in page_vma_mapped_walk. Commented Jan 11, 2023 at 5:17

2 Answers


When memory usage increases, the memory available for the page cache and buffers decreases. With a smaller cache, more data must be read from disk, because it is no longer cached.

This will, of course, lead to more IO activity.
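One way to observe this directly is to time two consecutive reads of the same file; a rough C sketch (illustrative only):

    /* Rough sketch: time two sequential reads of the same file.
       If the page cache can hold the file, the second read is much
       faster; under memory pressure the cached pages are evicted
       and the second read hits the disk again. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <time.h>

    static double read_file(const char *path) {
        char buf[1 << 16];
        struct timespec t0, t1;
        int fd = open(path, O_RDONLY);
        if (fd < 0) { perror("open"); exit(1); }
        clock_gettime(CLOCK_MONOTONIC, &t0);
        while (read(fd, buf, sizeof buf) > 0)
            ;                          /* just pull the data through the cache */
        clock_gettime(CLOCK_MONOTONIC, &t1);
        close(fd);
        return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    }

    int main(int argc, char **argv) {
        if (argc < 2) {
            fprintf(stderr, "usage: %s <file>\n", argv[0]);
            return 1;
        }
        printf("first read : %.3f s\n", read_file(argv[1]));  /* cold: from disk */
        printf("second read: %.3f s\n", read_file(argv[1]));  /* warm, if the cache survived */
        return 0;
    }

Run it against a file of a few hundred MiB while a memory hog is active, and the second read stays as slow as the first.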


Just to rephrase Andreas Rogge's answer.

Due to low memory, the filesystem cache is too small and is constantly being invalidated. This increases the load on the IO subsystem, as nearly every request has to go to the disks.
