I have searched and read many posts and questions about Linux memory management, but none of them matches my case. For example, there is a question on Unix StackExchange titled "High memory usage but no process is using it", whose accepted answer attributes the problem to virtual memory ballooning. In my case there is no ballooning: the output of vmware-toolbox-cmd stat balloon on all nodes is 0 MB.
I have a simple Hadoop cluster running CDH on three virtualized Oracle Linux VMs. Everything is configured correctly: all Cloudera installation tests passed, and Cloudera Manager shows no warnings.
After a fresh reboot everything is fine, and memory usage on all nodes is reasonable. I then start the MySQL database service used by Cloudera Manager, followed by all installed CDH services on the three nodes. After a while, I see unreasonably high memory usage on all nodes. It takes nearly one month to reach this state, but I think something is wrong.
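Because the growth is slow, it may help to record a few /proc/meminfo fields at regular intervals and see which one climbs. The following is only a minimal sketch: the function name sample_meminfo and the choice of fields are my own assumptions, not part of any existing tool.

```shell
#!/usr/bin/env bash
# Print one timestamped sample of selected /proc/meminfo fields (values in kB).
# Usage: sample_meminfo [meminfo-file]   (defaults to /proc/meminfo)
# The function name and the field selection are hypothetical choices.
sample_meminfo() {
    awk -v ts="$(date '+%Y-%m-%dT%H:%M:%S')" '
        # Keep only the fields of interest, in the order they appear in the file.
        $1 == "MemFree:" || $1 == "Cached:" || $1 == "Active(file):" || $1 == "Inactive(file):" {
            line = line " " $1 $2
        }
        END { print ts line }
    ' "${1:-/proc/meminfo}"
}

# Take one sample from the live system when /proc/meminfo is available.
if [ -r /proc/meminfo ]; then
    sample_meminfo /proc/meminfo
fi
```

Appending one such line per hour to a log file (for example from cron) would show whether Active(file) grows steadily or in steps that line up with particular jobs.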
I stopped all services on all nodes, including the installed CDH services, the Cloudera Manager services, and the MySQL database. In top I now see only native Oracle Linux processes such as gnome-shell, yet /proc/meminfo shows high Active(file) memory usage. Note that no active process is using it. Below is the output of several Linux commands on all three nodes.
First node with 72 GByte memory:
# top
top - 10:07:29 up 43 days, 22:10, 1 user, load average: 0.00, 0.01, 0.05
Tasks: 335 total, 1 running, 334 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.4 us, 0.4 sy, 0.0 ni, 99.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 73964544 total, 62105912 free, 10447080 used, 1411556 buff/cache
KiB Swap: 31457276 total, 31457276 free, 0 used. 72295264 avail Mem
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 9284 gdm       20   0 4493724 168588  80536 S   0.0  0.2   1:09.89 gnome-shell
  657 root      20   0  211512 139828 139328 S   0.0  0.2   2:02.12 systemd-journal
 1321 root      20   0  417540 114112 112620 S   0.0  0.2   3:38.10 rsyslogd
 9184 root      20   0  417184  30108  21796 S   0.0  0.0   0:01.46 X
 1808 root      20   0  687472  25908  12760 S   0.0  0.0  23:24.98 python2.7
 9404 gdm       20   0 1091248  24988  19520 S   0.0  0.0   0:00.60 gsd-media-keys
 9406 gdm       20   0  651064  23064  17928 S   0.0  0.0   0:00.82 gsd-power
 9387 gdm       20   0  634648  22980  17756 S   0.0  0.0   0:00.20 gsd-xsettings
 9396 gdm       20   0  657552  22976  17972 S   0.0  0.0   0:27.73 gsd-color
 9361 gdm       20   0  484424  22492  17656 S   0.0  0.0   0:00.15 ibus-x11
 9385 gdm       20   0  566536  22368  17348 S   0.0  0.0   0:00.19 gsd-wacom
 9389 gdm       20   0  570760  22276  17360 S   0.0  0.0   0:00.17 gsd-a11y-keyboa
 9402 gdm       20   0  642340  22220  17376 S   0.0  0.0   0:00.17 gsd-keyboard
 9393 gdm       20   0  484044  21852  17016 S   0.0  0.0   0:00.16 gsd-clipboard
 1059 polkitd   20   0  546320  19184   9708 S   0.0  0.0  10:11.73 polkitd
 1883 root      20   0  224860  18236   3772 S   0.0  0.0 105:21.35 python
 1982 root      20   0  212752  15816   7396 S   0.0  0.0   0:00.17 python2.7
 9419 gdm       20   0  721080  15256  13260 S   0.0  0.0   0:59.14 gsd-smartcard
 9214 gdm       20   0  696804  14056  11804 S   0.0  0.0   0:00.96 gnome-session-b
 1045 root      20   0  480552  14044  11728 S   0.0  0.0   0:29.82 NetworkManager
 9421 gdm       20   0  577744  13312  11448 S   0.0  0.0   0:00.03 gsd-sound
 9397 gdm       20   0  495504  13056  11212 S   0.0  0.0   0:00.03 gsd-datetime
 9407 gdm       20   0  484744  12320  10572 S   0.0  0.0   0:00.03 gsd-print-notif
# ps aux | awk '{sum+=$6} END {print sum / 1024}'
1163.8
# free -m
              total        used        free      shared  buff/cache   available
Mem:          72231       10201       60650         550        1378       70601
Swap:         30719           0       30719
# cat /proc/meminfo
MemTotal:       73964548 kB
MemFree:        62105864 kB
MemAvailable:   72295336 kB
Buffers:            1036 kB
Cached:           758788 kB
SwapCached:            0 kB
Active:          7807136 kB
Inactive:        2962332 kB
Active(anon):     281104 kB
Inactive(anon):   547168 kB
Active(file):    7526032 kB
Inactive(file):  2415164 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:      31457276 kB
SwapFree:       31457276 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:        264356 kB
Mapped:           208956 kB
Shmem:            563972 kB
Slab:             651816 kB
SReclaimable:     573188 kB
SUnreclaim:        78628 kB
KernelStack:        8592 kB
PageTables:        22452 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    68439548 kB
Committed_AS:    2689848 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      370120 kB
VmallocChunk:   34321002492 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:      210880 kB
DirectMap2M:    19712000 kB
DirectMap1G:    57671680 kB

Second node with 36 GByte memory:
# top
top - 10:07:58 up 43 days, 22:10, 1 user, load average: 0.00, 0.01, 0.05
Tasks: 336 total, 1 running, 335 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.4 us, 0.4 sy, 0.0 ni, 99.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 36807704 total, 29874328 free, 5905568 used, 1027808 buff/cache
KiB Swap: 31457276 total, 31457276 free, 0 used. 35350592 avail Mem
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 1865 gdm       20   0 4561224 251136  82240 S   0.0  0.7  29:28.13 gnome-shell
  649 root      20   0  211356 138832 138340 S   0.0  0.4   2:03.48 systemd-journal
 1341 root      20   0  417384 112980 111484 S   0.0  0.3   3:18.25 rsyslogd
 2229 gdm       20   0  735040  33768  17968 S   0.0  0.1 103:09.50 gsd-color
 1683 root      20   0  417204  30532  22216 S   0.0  0.1   0:23.65 X
29070 root      20   0  687476  25824  12676 S   0.0  0.1  10:43.35 python2.7
 2241 gdm       20   0 1017516  24812  19372 S   0.0  0.1   0:10.19 gsd-media-keys
 2243 gdm       20   0  651068  23196  18064 S   0.0  0.1   0:15.36 gsd-power
 2219 gdm       20   0  634644  22872  17668 S   0.0  0.1   0:00.22 gsd-xsettings
 2152 gdm       20   0  484424  22360  17536 S   0.0  0.1   0:00.15 ibus-x11
 2215 gdm       20   0  566536  22240  17236 S   0.0  0.1   0:00.18 gsd-wacom
 2240 gdm       20   0  642336  22164  17348 S   0.0  0.1   0:00.18 gsd-keyboard
 2222 gdm       20   0  570756  21880  16976 S   0.0  0.1   0:00.17 gsd-a11y-keyboa
 2225 gdm       20   0  484044  21668  16852 S   0.0  0.1   0:00.16 gsd-clipboard
 1093 polkitd   20   0  546320  19000   9556 S   0.0  0.1  10:19.18 polkitd
 1717 root      20   0  223220  16620   3812 S   0.0  0.0  44:34.87 python
 1731 root      20   0  212752  15832   7416 S   0.0  0.0   0:00.71 python2.7
 2257 gdm       20   0  721208  15032  13040 S   0.0  0.0  25:09.76 gsd-smartcard
 1750 gdm       20   0  696800  13964  11712 S   0.0  0.0   0:34.28 gnome-session-b
 1071 root      20   0  480520  13844  11500 S   0.0  0.0   0:31.47 NetworkManager
 2258 gdm       20   0  577744  13148  11276 S   0.0  0.0   0:00.04 gsd-sound
 2235 gdm       20   0  495504  12872  11028 S   0.0  0.0   0:00.03 gsd-datetime
 2245 gdm       20   0  484744  12176  10424 S   0.0  0.0   0:00.03 gsd-print-notif
# ps aux | awk '{sum+=$6} END {print sum / 1024}'
1247.77
# free -m
              total        used        free      shared  buff/cache   available
Mem:          35945        5765       29175         511        1003       34523
Swap:         30719           0       30719
# cat /proc/meminfo
MemTotal:       36807704 kB
MemFree:        29875428 kB
MemAvailable:   35351836 kB
Buffers:               0 kB
Cached:           691300 kB
SwapCached:            0 kB
Active:          5482728 kB
Inactive:         785216 kB
Active(anon):     369828 kB
Inactive(anon):   506792 kB
Active(file):    5112900 kB
Inactive(file):   278424 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:      31457276 kB
SwapFree:       31457276 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:        353336 kB
Mapped:           210668 kB
Shmem:            523288 kB
Slab:             336724 kB
SReclaimable:     268132 kB
SUnreclaim:        68592 kB
KernelStack:        8592 kB
PageTables:        22688 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    49861128 kB
Committed_AS:    2750840 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      268540 kB
VmallocChunk:   34340171772 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:      180160 kB
DirectMap2M:     9256960 kB
DirectMap1G:    30408704 kB

Third node with 24 GByte memory:
# top
top - 10:08:01 up 43 days, 22:11, 1 user, load average: 0.00, 0.01, 0.05
Tasks: 335 total, 1 running, 334 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.4 us, 0.7 sy, 0.0 ni, 98.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 24422436 total, 17590640 free, 5837736 used, 994060 buff/cache
KiB Swap: 31457276 total, 31457276 free, 0 used. 23084448 avail Mem
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 1820 gdm       20   0 4556244 244876  82152 S   0.0  1.0  28:58.89 gnome-shell
  647 root      20   0  211328 139016 138516 S   0.0  0.6   2:01.63 systemd-journal
 1330 root      20   0  417356 112584 111100 S   0.0  0.5   2:56.85 rsyslogd
 2084 gdm       20   0  661304  33656  17880 S   0.0  0.1  90:41.83 gsd-color
 1669 root      20   0  417184  30612  22296 S   0.0  0.1   0:23.21 X
 1838 root      20   0  687472  25684  12540 S   0.0  0.1  22:38.69 python2.7
 2094 gdm       20   0 1230516  24960  19484 S   0.0  0.1   0:10.22 gsd-media-keys
 2098 gdm       20   0  651064  23148  18020 S   0.0  0.1   0:15.32 gsd-power
 2076 gdm       20   0  634648  23000  17804 S   0.0  0.1   0:00.20 gsd-xsettings
 2074 gdm       20   0  566536  22348  17348 S   0.0  0.1   0:00.21 gsd-wacom
 2079 gdm       20   0  570756  22080  17180 S   0.0  0.1   0:00.16 gsd-a11y-keyboa
 2035 gdm       20   0  484424  22024  17204 S   0.0  0.1   0:00.14 ibus-x11
 2093 gdm       20   0  642336  21904  17088 S   0.0  0.1   0:00.17 gsd-keyboard
 2083 gdm       20   0  484044  21816  16996 S   0.0  0.1   0:00.18 gsd-clipboard
 1071 polkitd   20   0  546312  18812   9372 S   0.0  0.1  10:10.08 polkitd
 1719 root      20   0  222800  16056   3708 S   0.0  0.1  31:42.76 python
 1724 root      20   0  212752  15692   7276 S   0.0  0.1   0:00.62 python2.7
 2111 gdm       20   0  721080  15116  13120 S   0.0  0.1  23:53.20 gsd-smartcard
 1714 gdm       20   0  696800  13876  11620 S   0.0  0.1   0:34.06 gnome-session-b
 1049 root      20   0  480512  13872  11580 S   0.0  0.1   0:30.41 NetworkManager
 2115 gdm       20   0  577744  12984  11116 S   0.0  0.1   0:00.03 gsd-sound
 2085 gdm       20   0  495504  12848  11004 S   0.0  0.1   0:00.03 gsd-datetime
 2099 gdm       20   0  484744  12052  10304 S   0.0  0.0   0:00.02 gsd-print-notif
# ps aux | awk '{sum+=$6} END {print sum / 1024}'
1240.27
# free -m
              total        used        free      shared  buff/cache   available
Mem:          23850        5701       17177         507         970       22542
Swap:         30719           0       30719
# cat /proc/meminfo
MemTotal:       24422436 kB
MemFree:        17590524 kB
MemAvailable:   23084532 kB
Buffers:               0 kB
Cached:           687492 kB
SwapCached:            0 kB
Active:          5470816 kB
Inactive:         780568 kB
Active(anon):     361540 kB
Inactive(anon):   504624 kB
Active(file):    5109276 kB
Inactive(file):   275944 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:      31457276 kB
SwapFree:       31457276 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:        346348 kB
Mapped:           210440 kB
Shmem:            519888 kB
Slab:             306796 kB
SReclaimable:     241160 kB
SUnreclaim:        65636 kB
KernelStack:        8592 kB
PageTables:        22624 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    43668492 kB
Committed_AS:    2753904 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      215184 kB
VmallocChunk:   34346561532 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:      180160 kB
DirectMap2M:     7159808 kB
DirectMap1G:    19922944 kB

As you can see, Active(file) accounts for most of the memory usage. I read about Active(file) in this question. According to the answer, active memory is the total amount of buffer or page cache memory, in kilobytes, that is in active use: memory that has been used recently and is usually not reclaimed for other purposes.
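To make the mismatch concrete, here is a small sketch that compares the file LRU lists with the page cache reported in the same /proc/meminfo output. As far as I understand, Active(file) + Inactive(file) normally stays close to Buffers + Cached - Shmem; that relation is a rule of thumb I am assuming, not a guarantee, and the function name file_lru_gap is hypothetical.

```shell
#!/usr/bin/env bash
# Compare the file LRU lists with the page cache that /proc/meminfo reports.
# Usage: file_lru_gap [meminfo-file]   (defaults to /proc/meminfo)
# Assumption: Active(file) + Inactive(file) is normally about Buffers + Cached - Shmem.
file_lru_gap() {
    awk '
        # Each line looks like "Name:   12345 kB"; strip the colon and keep the value.
        { sub(":", "", $1); v[$1] = $2 }
        END {
            lru   = v["Active(file)"] + v["Inactive(file)"]
            cache = v["Buffers"] + v["Cached"] - v["Shmem"]
            printf "file_lru_kB=%d page_cache_kB=%d gap_kB=%d\n", lru, cache, lru - cache
        }
    ' "${1:-/proc/meminfo}"
}

# Report the gap for the live system when /proc/meminfo is available.
if [ -r /proc/meminfo ]; then
    file_lru_gap /proc/meminfo
fi
```

With the first node's figures (Buffers 1036, Cached 758788, Shmem 563972, Active(file) 7526032, Inactive(file) 2415164) it prints file_lru_kB=9941196 page_cache_kB=195852 gap_kB=9745344, so roughly 9.3 GiB sits on the file LRU lists without showing up in Cached.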
As I said, all services are down. I want to release this memory. While the cluster is running, this memory usage increases day by day. All nodes are virtualized Oracle Linux 7.5 on VMware ESXi.
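One way to check whether the kernel will give this memory back is to drop the clean page cache through /proc/sys/vm/drop_caches and compare Active(file) before and after. This is a minimal sketch, assuming root privileges; the function names are hypothetical, and I do not know in advance whether the pages in question are droppable at all.

```shell
#!/usr/bin/env bash
# Report Active(file) before and after asking the kernel to drop clean page cache.
# Usage: drop_cache_check [meminfo-file] [drop_caches-file]   (run as root)
# Both function names are hypothetical; the defaults point at the real kernel files.
active_file_kb() {
    awk '$1 == "Active(file):" { print $2 }' "${1:-/proc/meminfo}"
}

drop_cache_check() {
    local meminfo="${1:-/proc/meminfo}"
    local knob="${2:-/proc/sys/vm/drop_caches}"
    local before after
    before=$(active_file_kb "$meminfo")
    sync                              # write out dirty pages first so they become droppable
    echo 1 > "$knob" || return 1      # 1 = page cache only; 3 also drops dentries and inodes
    after=$(active_file_kb "$meminfo")
    echo "Active(file): before=${before} kB after=${after} kB"
}

# drop_cache_check    # uncomment to run against the live system as root
```

If the two values are almost identical, the pages are not ordinary clean cache, and the cause is probably something other than normal file caching.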
I would appreciate any pointers on how to solve this problem.