I did find this thread that was related but I have snapshots.
https://forums.freebsd.org/threads/zfs-dataset-is-occupying-more-space-than-the-actual-data.83901/
A few months ago I had a ZFS corruption issue with the server in question and re-partitioned.
Since then, my zfs send | zfs recv backups have been hanging every few days.
Well, this one dataset had a lot of variation in the zfs list -t snap REFER column. Over the hours it would climb from 50G to 155G, drop back down, climb to 100G, drop again, and so on. But right now, the latest snapshot shows 320G!
Code:
root@smtp:~ # df -h /zsmtp_jail/postfix
Filesystem           Size    Used   Avail Capacity  Mounted on
zsmtp_jail/postfix   919G    320G    599G    35%    /zsmtp_jail/postfix
root@smtp:~ # du -hs /zsmtp_jail/postfix/
 55G    /zsmtp_jail/postfix/
root@smtp:~ # zfs list
NAME                 USED  AVAIL  REFER  MOUNTPOINT
zsmtp_jail/postfix   320G   620G   320G  /zsmtp_jail/postfix
When I enter the jail and run du I get the same 55G.
Why the wild size discrepancy? 919G? 320G? 55G?
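For what it's worth, the first thing I would normally reach for here is the built-in space breakdown in zfs(8), which splits USED into snapshot, dataset, child, and refreservation components. This is a standard command on OpenZFS; the sample output below is illustrative, not taken from this system:

```shell
# 'zfs list -o space' is shorthand for:
#   name,avail,used,usedsnap,usedds,usedrefreserv,usedchild
zfs list -o space zsmtp_jail/postfix

# Illustrative layout (values are made up):
# NAME                AVAIL  USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
# zsmtp_jail/postfix   620G  320G      265G     55G             0B         0B
```

If USEDSNAP is large, the space is held by snapshots; if USEDDS itself is far above what du reports, that points elsewhere (pending deletes, clones, etc.).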
I renamed my backup dataset to preserve all my snaps and destroyed all the snaps on the live server. Now I get:
Code:
root(4)smtp:~ # zfs list zsmtp_jail/postfix
NAME USED AVAIL REFER MOUNTPOINT
zsmtp_jail/postfix 320G 620G 320G /zsmtp_jail/postfix
zpool status shows:
Code:
  pool: zsmtp_jail
 state: ONLINE
  scan: scrub repaired 0B in 00:27:09 with 0 errors on Fri May 24 04:22:09 2024 <--TODAY!
config:

	NAME          STATE     READ WRITE CKSUM
	zsmtp_jail    ONLINE       0     0     0
	  mirror-0    ONLINE       0     0     0
	    nda1p3    ONLINE       0     0     0
	    nda2p3    ONLINE       0     0     0
1/ As far as I know, the pool is fine (says zfs) and there is nothing that I can do to fix the dataset. Any info to the contrary is welcome.
2/ As far as I know, the size discrepancy indicates a real problem, since there is *NOT* 320G in the dataset now that all the snaps are destroyed. Perhaps the snapshot space takes time to be recalculated? I would like to know how ZFS handles this. Is it fixed on the next scrub? I looked and found no reference to scrub recalculating sizes. Any info welcome, especially if I can trigger the recalculation.
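One possibly relevant detail: space from destroyed snapshots is reclaimed asynchronously in the background, not by scrub, and the pool tracks the amount still pending. Assuming a reasonably recent OpenZFS (FreeBSD 13+), you can watch it like this:

```shell
# How many bytes of freed-but-not-yet-reclaimed space remain in the pool:
zpool get freeing zsmtp_jail

# On OpenZFS versions that support zpool-wait(8), block until
# background frees have finished:
zpool wait -t free zsmtp_jail
```

If `freeing` is large and shrinking, USED should drop on its own once the background work completes; if it stays at 0 while USED remains inflated, the space is genuinely referenced by something.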
3/ In the last month I have had forty-six backups hang on zfs send/recv. About a third of those are on a server that is backing up to itself, from an SSD to a hard drive, so I don't think SSH has anything to do with it. The datasets vary.
For example:
back/smtp/zsmtp/usr/src
back/smtp/zsmtp/var/crash
back/aujail/jail
back/smtp/zsmtp/var
zgep_back/zgep/var/crash
back/smtp/zsmtp_jail/jmusicbot
zgep_back/zgep/ROOT
zgep_back/zgep/var
zgep_back/zgep/var/crash
zgep_back/zgep/var/crash
zgep_back/zgep/var/crash
zgep_back/zgep/usr
back/smtp/zsmtp/usr
I would love to know:
Why do these datasets hang on zfs send/recv?
Is there any command I can run to find datasets that are in a state where they could or would hang?
Is there a way to 'clean' them so they don't hang anymore?
Should I be thinking of a new ZFS pool again? Should I only move the data in by rsync rather than zfs send/recv?
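In case it helps anyone diagnose this, here is roughly how I would inspect a hang the next time it happens. These are stock FreeBSD/OpenZFS tools; the dataset name in the last command is just one of the examples above:

```shell
# Find the stuck processes and see which kernel function each thread
# is blocked in (run as root):
pgrep -lf 'zfs (send|recv)'
procstat -kk $(pgrep -f 'zfs recv')

# If the receive side was started with 'zfs recv -s', an interrupted
# receive leaves a resume token on the target dataset ('-' means none):
zfs get -H -o value receive_resume_token back/smtp/zsmtp/usr
```

The kernel stack from procstat (e.g. stuck in txg_wait or a cv_wait) usually tells you whether the hang is on the sending side, the receiving side, or the pipe between them.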
Any thoughts appreciated.
4/ If the total files in a dataset without snapshots can be 50G but zfs reports 320G, is there some command to see what the extra space is used for?
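To partially answer my own question 4, these are the standard accounting views I know of; both commands exist in stock OpenZFS, though zdb can be slow on a large dataset:

```shell
# Logical vs. physical accounting for the dataset: a large gap between
# 'used' and 'logicalused' reflects compression; a large 'usedbysnapshots'
# means snapshots still hold the space.
zfs get used,referenced,logicalused,logicalreferenced,usedbysnapshots \
    zsmtp_jail/postfix

# Low-level, read-only per-dataset debug view (object counts and sizes):
zdb -d zsmtp_jail/postfix
```

If neither of those explains the extra ~265G, that would strengthen the case that the dataset's space map itself is damaged.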