
I am tarring and then compressing a bunch of files and directories on my Ubuntu Server VPS for a backup. It only has 1GB of RAM and 128MB of swap (I can't add more; OVH uses OpenVZ as their virtualisation software), and every time tar runs it uses a ton of memory for its buffer, causing everything else to get swapped out, even when using nice -n 10.

Is there any way to force tar to use a small buffer and reduce its memory usage? I am worried that once the backup gets to a certain size, my server will go down because tar won't have enough memory for its buffer.

I am using bzip2 to compress, and I have already limited its memory usage with the -4 option.

Edit: Here is what htop looks like when I have had tar running for a while:

[htop screenshot]

Here is a link to the full GIF.

Edit 2: Here is the tar command I am using:

nice -n 20 tar --exclude "*node_modules*" --exclude "*.git/*" --exclude "/srv/www-mail/rainloop/v*" -cf archive.tar /home /var/log /var/mail /srv /etc 
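
As a side note, a minimal sketch of how that tar command might be piped straight into the memory-limited bzip2 (the -4 level comes from the question; the output name and the nice value of 19, the maximum niceness, are assumptions):

    nice -n 19 tar --exclude "*node_modules*" --exclude "*.git/*" \
        --exclude "/srv/www-mail/rainloop/v*" \
        -cf - /home /var/log /var/mail /srv /etc | bzip2 -4 > archive.tar.bz2

Writing the archive to stdout and compressing on the fly avoids an uncompressed intermediate file on disk, so less data has to pass through the page cache.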
  • How do you see that tar is using much memory? I guess it just causes Linux to remove useful "hot" data from its cache and replace it with useless "cold" data that is being backed up (and not needed in the cache). Commented Jul 8, 2015 at 21:23
  • @Marki555 I used htop to observe my memory and swap usage. I used this tutorial to view which processes were using the most swap before and after, and I noticed that tarring a large amount of stuff causes almost everything else to get swapped out :/ (one way to check this is sketched after these comments) Commented Jul 9, 2015 at 5:28
  • Can you include the output of htop into your question? Commented Jul 9, 2015 at 6:48
  • @Marki555 Sure, I will update the question as soon as I get the chance. Commented Jul 9, 2015 at 6:51
  • If your /tmp is mounted as tmpfs, then yes, it does. tar itself doesn't seem to use much memory in the screenshot. Commented Jul 9, 2015 at 14:48
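
One way to check per-process swap usage (a guess at what the linked tutorial describes, since it isn't quoted here) is to read VmSwap from /proc:

    # print swap usage in kB per process, largest first
    for f in /proc/[0-9]*/status; do
        awk '/^Name:/{n=$2} /^VmSwap:/{print $2 "\t" n}' "$f"
    done | sort -rn | head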

1 Answer


Your image shows quite the contrary, actually.

As you can see in the RES column, tar's memory consumption is quite low. Your RAM usage appears to increase simply because Linux is actively caching the data read by the tar command. This, in turn, causes memory pressure and dirty page writeback (basically, the system flushes its write cache to accommodate the greater read caching required) and, possibly, useful data are evicted from the I/O cache.

Unfortunately, it seems that tar itself cannot be instructed to use O_DIRECT or posix_fadvise() (both of which can be used to bypass the cache). So, with tar alone, there is no real solution here...
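
If you want to confirm that it is the page cache growing rather than tar itself, a rough sketch is to watch the cache and dirty-page counters while the backup runs:

    # refresh every 2 seconds; Cached grows as tar reads, Dirty/Writeback reflect pending flushes
    watch -n 2 'grep -E "^(MemFree|Cached|Dirty|Writeback):" /proc/meminfo'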

  • Thanks for your explanation. Is there a different tool I can use then that doesn't fill up the read cache? Commented Jul 11, 2015 at 10:42
  • Unfortunately, only some tools support direct I/O operations. The most common tool is dd, and you can use it to compress a file with something like dd if=srcfile bs=1M iflag=direct | bzip2 > newfile.bz2. However, this is clearly no match for tar on a full directory tree. Commented Jul 11, 2015 at 13:30
  • Thanks for the help. Perhaps I need more RAM, then...? Commented Jul 11, 2015 at 16:04
  • You probably need more RAM and a faster disk subsystem. As a workaround, you can try to disable filesystem caching entirely during the tar/bzip2 run, then re-enable it. To disable caching, remount your filesystem with the sync option. For example, to use your / filesystem for the tar/bzip2 process, you would issue mount / -o remount,sync. Then, after completion, you can remount it with caching enabled using mount / -o remount,async. Commented Jul 11, 2015 at 16:28
  • Update: I have found a tool called nocache, which keeps the files it reads from being cached; this seems to solve the problem :D Commented Jul 11, 2015 at 17:23
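
For completeness, a sketch of how nocache might be prefixed to the command from the question (the excludes are omitted here for brevity; whether bzip2 also needs wrapping is an assumption):

    nocache tar -cf - /home /var/log /var/mail /srv /etc | bzip2 -4 > archive.tar.bz2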
