
I have a 2 TB USB hard drive that I use as a backup disk. The disk contains a GPT partition table with one partition of type bf00. On that partition, I created a ZFS pool with encryption and compression enabled, and a single dataset.
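
For the record, the pool and dataset were created roughly like this; the device path, cipher and key format below are illustrative guesses rather than a verbatim record of my commands:

# sketch only: device path, cipher and key format are assumptions
zpool create -o ashift=12 \
    -O compression=lz4 \
    -O encryption=aes-256-gcm -O keyformat=passphrase \
    -O mountpoint=none \
    backup /dev/disk/by-partlabel/BackupDisk1
zfs create -o mountpoint=/backup backup/DATA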

While I was rsyncing my files to the disk, I noticed that the total size of the mounted dataset got smaller and smaller (please note: this is the weird part, it really is the total size, not the available size). How can that be? And how can I use the full capacity?

This is the output of df -h, the total size is already down to 1.2T (rsync is still copying at the moment):

backup/DATA 1,2T 380G 834G 32% /backup 

This is zpool list:

# zpool list
NAME     SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
backup  1,81T   964G   892G        -         -     3%    51%  1.01x  ONLINE  -

And this is zfs list:

# zfs list
NAME          USED  AVAIL  REFER  MOUNTPOINT
backup        973G   832G    98K  none
backup/DATA   381G   832G   381G  /backup

So it seems that about one third of the capacity is missing. How can that be? Can I reclaim the space somehow? And where did it go? I'm using Arch Linux (5.3.8-arch1-1) with zfs-dkms 0.8.2-1.

BTW: I'm not talking about the 2 TB vs. 1.8 TiB issue, this is something else.
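
(For completeness, that well-known difference is just decimal vs. binary units, e.g.:)

# quick sanity check: 2 TB (decimal) expressed in TiB (binary), matching the 1,81T reported by zpool list
awk 'BEGIN { printf "%.2f TiB\n", 2e12 / 2^40 }'    # prints 1.82 TiB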

Update:

Here's the output of zpool status:

zpool status
  pool: backup
 state: ONLINE
  scan: none requested
config:

        NAME           STATE     READ WRITE CKSUM
        backup         ONLINE       0     0     0
          BackupDisk1  ONLINE       0     0     0

errors: No known data errors

and

zfs list -o space
NAME         AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
backup        793G  1011G        0B     98K             0B      1011G
backup/DATA   793G   422G        0B    422G             0B         0B

Latest news:

OK, I left the system to itself overnight just to see what would happen. When I last looked, the figures were as above: the total space of the dataset backup/DATA kept shrinking while I was copying a few hundred GB onto it. Even after rsync finished, the drive remained busy (as indicated by the LED), and there was also considerable background CPU usage.

When I took a look this morning, the total size of backup/DATA was back at 1.8 TB and all the background work had obviously finished. Tadaa! :-)

I think what might have happened is this: rsync was throwing a large number of files at the dataset. ZFS seems to receive and buffer the data that needs to be written, and this buffer apparently reduces the total usable size while it exists. Since I have compression and encryption enabled (on the pool and dataset, respectively), working through it may have taken a while (long after rsync finished), even on my fairly decent workstation (12 cores, 32 GB RAM), perhaps because the USB drive simply isn't fast.
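
For anyone who wants to watch this happen, the following read-only commands should show whether the pool is still busy in the background; the module parameter path is from OpenZFS 0.8 and may differ between versions:

# per-vdev I/O statistics, refreshed every 5 seconds, while the background work runs
zpool iostat -v backup 5
# how much dirty (buffered, not yet written) data ZFS will hold before throttling writers
cat /sys/module/zfs/parameters/zfs_dirty_data_max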

Can somebody confirm that this (or something in that direction) is what happens? I think it would be good to know for everyone who runs into a similar issue.

  • Do you have any snapshots? Please show the output of zpool status and zfs list -t all; zfs list -o space Commented Nov 3, 2019 at 20:01
  • No, I did not create any snapshots. Please see my updated question. Commented Nov 3, 2019 at 20:07

1 Answer


You have ~600 GB referred by the backup dataset alone, with an additional 422 GB referred by backup/DATA.

The approach ZFS uses to "publish" the right amount of free space to legacy utilities such as df is to alter the total reported disk size. While slightly confusing, it produces the correct amount of free space and is much clearer than, say, what BTRFS does.
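
If you want to see this effect in isolation, a small throwaway file-backed pool demonstrates it (pool and dataset names below are made up for the demonstration):

truncate -s 1G /tmp/zfs-demo.img
zpool create demo /tmp/zfs-demo.img
zfs create -o mountpoint=/mnt/a demo/a
zfs create -o mountpoint=/mnt/b demo/b
df -h /mnt/a                                        # note the Size column
dd if=/dev/urandom of=/mnt/b/blob bs=1M count=300
zpool sync demo
df -h /mnt/a                                        # Size shrinks by roughly the 300M written to demo/b
zpool destroy demo && rm /tmp/zfs-demo.img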

In your specific case, as you write to backup (rather than backup/DATA), the total space available to the other datasets/filesystems is reduced accordingly.

EDIT: as the OP confirmed he really wrote nothing to backup, I offer an additional explanation. ZFS features a sort of "delete throttle", where deleted files are de-allocated in the background. As rsync creates and deletes many temporary files in a short time, it is possible that the deleted-but-not-yet-deallocated files were counted against the root dataset backup (reducing the AVAIL for backup/DATA).
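
If that is what happened, the pending background frees should be visible while they are being processed; only the pool name below is taken from your output, the commands themselves are standard ZFS tooling:

# space already released at the filesystem level but still being deallocated in the background
zpool get freeing backup
# re-check where the space is accounted once the value above has dropped back to 0
zfs list -o space backup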

  • Thanks for the explanation. I did not write anything to "backup" itself, just to backup/DATA, which is mounted at /backup in Linux. How can there be space used on the parent backup dataset? Can I reclaim it somehow? Or do I just have to keep filling the child dataset and ZFS enlarges backup/DATA automatically? Commented Nov 3, 2019 at 21:43
  • Based on the output of zfs list, you did write something on backup. Please double-check and, if unsure, post the output of df and ls -al /backup Commented Nov 4, 2019 at 6:35
  • The explanation is correct, but I don't see where your numbers are coming from. The outputs shown are simply incomplete. backup is truly empty. Commented Nov 4, 2019 at 6:46
  • No, I definitely did not write anything to backup, I also made sure by mounting backup directly and taking a look. But I think I found the answer, please see my updated question. Commented Nov 4, 2019 at 9:03
  • @Robert I've updated my answer with a possible explanation Commented Nov 4, 2019 at 11:55
