ZFS: Strange loss of free space

Hey there,

we are seeing a strange effect. After some time (weeks) the ALLOC rises well above the real file usage (here by more than 50 GB). System commands like du and df always show (nearly) the correct usage of the dataset, but the ALLOC and AVAIL figures are completely off. See the outputs below:
vsd -- /home/vsd is the 'old' dataset, a few weeks old. vsdnew -- /home/vsdnew is the 'new' dataset, created via rsync from the 'old' dataset and containing exactly the same data.

Rsync command used:
rsync -aHAX --fileflags --delete /home/vsd/ /home/vsdnew
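
As a cross-check of the numbers below, it can help to compare apparent file sizes with the blocks actually allocated (a sketch; on FreeBSD, du -A reports apparent sizes, plain du the allocated blocks, and the paths assume the mountpoints shown above):

Code:
# du -Ash /home/vsd/xxxxxx /home/vsdnew/xxxxxx
# du -sh  /home/vsd/xxxxxx /home/vsdnew/xxxxxx

If both pairs roughly agree while ALLOC does not, the missing space is not held by the visible files themselves.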

Code:
# df -h
Filesystem       Size    Used   Avail Capacity  Mounted on
vsd              126G    2,4G    124G     2%    /home/vsd
vsd/xxxxxx       239G    115G    124G    48%    /home/vsd/xxxxxx
vsdnew           184G    4,3G    180G     2%    /home/vsdnew
vsdnew/xxxxxx    236G     56G    180G    24%    /home/vsdnew/xxxxxx

Code:
# zpool list
NAME     SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
vsd      249G   118G   131G        -         -    12%    47%  1.00x  ONLINE  -
vsdnew   248G  60,1G   188G        -         -     4%    24%  1.00x  ONLINE  -

Code:
# zfs list -t all -o space
NAME           AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
vsd             124G   118G         0   2,38G              0       115G
vsd/xxxxxx      124G   115G         0    115G              0          0
vsdnew          180G  60,1G         0   4,26G              0      55,9G
vsdnew/xxxxxx   180G  55,9G         0   55,9G              0          0


Code:
# zdb
vsdnew:
    version: 5000
    name: 'vsdnew'
    state: 0
    txg: 209774
    pool_guid: 16748974042122959608
    hostid: 1527396945
    hostname: 'xxxxxxxxxxxxxxxxxxxxxx'
    com.delphix:has_per_vdev_zaps
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 16748974042122959608
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 16956873139552582793
            path: '/dev/xbd5'
            whole_disk: 1
            metaslab_array: 67
            metaslab_shift: 31
            ashift: 12
            asize: 268430737408
            is_log: 0
            create_txg: 4
            com.delphix:vdev_zap_leaf: 65
            com.delphix:vdev_zap_top: 66
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data

Code:
# zpool get all vsdnew
NAME  PROPERTY                       VALUE                          SOURCE
vsdnew   size                           248G                           -
vsdnew   capacity                       24%                            -
vsdnew  altroot                        -                              default
vsdnew   health                         ONLINE                         -
vsdnew   guid                           16748974042122959608           default
vsdnew   version                        -                              default
vsdnew   bootfs                         -                              default
vsdnew   delegation                     on                             default
vsdnew   autoreplace                    off                            default
vsdnew   cachefile                      -                              default
vsdnew   failmode                       wait                           default
vsdnew   listsnapshots                  off                            default
vsdnew   autoexpand                     off                            default
vsdnew   dedupditto                     0                              default
vsdnew   dedupratio                     1.00x                          -
vsdnew   free                           188G                           -
vsdnew   allocated                      60,2G                          -
vsdnew   readonly                       off                            -
vsdnew   comment                        -                              default
vsdnew   expandsize                     -                              -
vsdnew   freeing                        0                              default
vsdnew   fragmentation                  4%                             -
vsdnew   leaked                         0                              default
vsdnew   bootsize                       -                              default
vsdnew   checkpoint                     -                              -
vsdnew   feature@async_destroy          enabled                        local
vsdnew   feature@empty_bpobj            active                         local
vsdnew   feature@lz4_compress           active                         local
vsdnew   feature@multi_vdev_crash_dump  enabled                        local
vsdnew   feature@spacemap_histogram     active                         local
vsdnew   feature@enabled_txg            active                         local
vsdnew   feature@hole_birth             active                         local
vsdnew   feature@extensible_dataset     enabled                        local
vsdnew   feature@embedded_data          active                         local
vsdnew   feature@bookmarks              enabled                        local
vsdnew   feature@filesystem_limits      enabled                        local
vsdnew   feature@large_blocks           enabled                        local
vsdnew   feature@large_dnode            enabled                        local
vsdnew   feature@sha512                 enabled                        local
vsdnew   feature@skein                  enabled                        local
vsdnew   feature@device_removal         enabled                        local
vsdnew   feature@obsolete_counts        enabled                        local
vsdnew   feature@zpool_checkpoint       enabled                        local
vsdnew   feature@spacemap_v2            active                         local


May be of interest:

- There is a cron job that creates a new daily snapshot and removes the oldest one (keeping 7 days); a way to inspect the snapshots' space usage is sketched below.
- When we transfer the dataset with zfs send/receive into another ZFS pool, the same problem shows up on the new pool. So it seems ZFS 'thinks' the lost space is 'correct' and transfers it into the new pool - whatever is actually being transferred there...
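
To rule out the snapshots, their per-snapshot usage can be listed, and a dry-run destroy can estimate what would actually be freed (a sketch; the snapshot names here are made up, -n makes destroy a no-op, and -v prints the space that would be reclaimed):

Code:
# zfs list -r -t snapshot -o name,used,referenced vsd/xxxxxx
# zfs destroy -nv vsd/xxxxxx@daily-2024-02-01
# zfs destroy -nv vsd/xxxxxx@daily-2024-02-01%daily-2024-02-07

The last command estimates the effect of removing a whole range of snapshots at once, which is what matters when blocks are shared between several snapshots.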


So - as this is a huge loss of space: does anyone have an idea why all this space is lost? WHERE is it?


Many thanks in advance for your help!
jimmy
 
What is your version of FreeBSD? What is the output of
zpool get all vsdnew, zfs get all vsdnew, and zfs get all vsd?
It's a modified version of 12.0-RELEASE-p10 - nothing changed within the kernel regarding filesystem code.
Which values do you need from zpool get?
 
I agree with rootbert, we only have half the information to compare.

My initial thought might be that the copies property had been set to 2 on vsd - but that is speculation without seeing anything else.
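
That is quick to verify (a sketch; -r recurses into the child datasets):

Code:
# zfs get -r copies,recordsize,compression vsd vsdnew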
 
I agree with rootbert, we only have half the information to compare.

My initial thought might be that the copies property had been set to 2 on vsd - but that is speculation without seeing anything else.
It's not that I don't agree - I just wanted to avoid posting too much data. I also made a mistake in my first post: the zpool get all and zdb output both showed vsdNEW after it had been renamed to vsd. That is confusing, but it should make no difference, as the output below shows. Now with the correct names.

The only remarkable difference in my view is that I changed the recordsize from 128K to 8K in vsdnew (plus ashift 9 -> 12), but I can confirm from many other tests we did before that this does not make any difference. When I zfs send/receive or rsync to a new pool with 128K recordsize (ashift 9), the effect is the same: zfs send/receive transfers the 'wrong' amount of data, rsync only the right portion.

Of course copies is not set to anything other than 1 - but no problem, here we are. I really appreciate any help!

Code:
# zfs get all vsd
NAME    PROPERTY              VALUE                   SOURCE
vsd  type                  filesystem              -
vsd  creation              Mo. Jan. 29 20:26 2024  -
vsd  used                  118G                    -
vsd  available             124G                    -
vsd  referenced            2,38G                   -
vsd  compressratio         1.12x                   -
vsd  mounted               no                      -
vsd  quota                 none                    default
vsd  reservation           none                    default
vsd  recordsize            128K                    default
vsd  mountpoint            /home/vsd               received
vsd  sharenfs              off                     default
vsd  checksum              fletcher4               received
vsd  compression           lz4                     received
vsd  atime                 off                     received
vsd  devices               on                      default
vsd  exec                  on                      default
vsd  setuid                on                      default
vsd  readonly              off                     default
vsd  jailed                on                      received
vsd  snapdir               hidden                  received
vsd  aclmode               discard                 default
vsd  aclinherit            restricted              default
vsd  createtxg             1                       -
vsd  canmount              on                      default
vsd  xattr                 on                      default
vsd  copies                1                       default
vsd  version               5                       -
vsd  utf8only              off                     -
vsd  normalization         none                    -
vsd  casesensitivity       sensitive               -
vsd  vscan                 off                     default
vsd  nbmand                off                     default
vsd  sharesmb              off                     default
vsd  refquota              none                    default
vsd  refreservation        none                    default
vsd  guid                  10183450069749680352    -
vsd  primarycache          all                     default
vsd  secondarycache        metadata                received
vsd  usedbysnapshots       0                       -
vsd  usedbydataset         2,38G                   -
vsd  usedbychildren        115G                    -
vsd  usedbyrefreservation  0                       -
vsd  logbias               latency                 default
vsd  dedup                 off                     default
vsd  mlslabel                                      -
vsd  sync                  standard                default
vsd  dnodesize             legacy                  default
vsd  refcompressratio      2.12x                   -
vsd  written               2,38G                   -
vsd  logicalused           132G                    -
vsd  logicalreferenced     5,01G                   -
vsd  volmode               default                 default
vsd  filesystem_limit      none                    default
vsd  snapshot_limit        none                    default
vsd  filesystem_count      none                    default
vsd  snapshot_count        none                    default
vsd  redundant_metadata    all                     default


Code:
# zfs get all vsdnew
NAME  PROPERTY              VALUE                   SOURCE
vsdnew   type                  filesystem              -
vsdnew   creation              Di. Feb.  6 16:16 2024  -
vsdnew   used                  60,5G                   -
vsdnew   available             180G                    -
vsdnew   referenced            4,26G                   -
vsdnew   compressratio         1.11x                   -
vsdnew   mounted               yes                     -
vsdnew   quota                 none                    default
vsdnew   reservation           none                    default
vsdnew   recordsize            8K                      local
vsdnew   mountpoint            /home/vsdnew            received
vsdnew   sharenfs              off                     default
vsdnew   checksum              fletcher4               local
vsdnew   compression           lz4                     local
vsdnew   atime                 off                     local
vsdnew   devices               on                      default
vsdnew   exec                  on                      default
vsdnew   setuid                on                      default
vsdnew   readonly              off                     default
vsdnew   jailed                off                     default
vsdnew   snapdir               hidden                  local
vsdnew   aclmode               discard                 default
vsdnew   aclinherit            restricted              default
vsdnew   createtxg             1                       -
vsdnew   canmount              on                      default
vsdnew   xattr                 off                     temporary
vsdnew   copies                1                       default
vsdnew   version               5                       -
vsdnew   utf8only              off                     -
vsdnew   normalization         none                    -
vsdnew   casesensitivity       sensitive               -
vsdnew   vscan                 off                     default
vsdnew   nbmand                off                     default
vsdnew   sharesmb              off                     default
vsdnew   refquota              none                    default
vsdnew   refreservation        none                    default
vsdnew   guid                  8745797413885517509     -
vsdnew   primarycache          all                     default
vsdnew   secondarycache        metadata                local
vsdnew   usedbysnapshots       0                       -
vsdnew   usedbydataset         4,26G                   -
vsdnew   usedbychildren        56,3G                   -
vsdnew   usedbyrefreservation  0                       -
vsdnew   logbias               latency                 default
vsdnew   dedup                 off                     default
vsdnew   mlslabel                                      -
vsdnew   sync                  standard                default
vsdnew   dnodesize             legacy                  default
vsdnew   refcompressratio      1.29x                   -
vsdnew   written               4,26G                   -
vsdnew   logicalused           64,9G                   -
vsdnew   logicalreferenced     4,99G                   -
vsdnew   volmode               default                 default
vsdnew   filesystem_limit      none                    default
vsdnew   snapshot_limit        none                    default
vsdnew   filesystem_count      none                    default
vsdnew   snapshot_count        none                    default
vsdnew   redundant_metadata    all                     default

Code:
# zpool get all vsd
NAME    PROPERTY                       VALUE                          SOURCE
vsd  size                           249G                           -
vsd  capacity                       47%                            -
vsd  altroot                        -                              default
vsd  health                         ONLINE                         -
vsd  guid                           1132442073773096964            default
vsd  version                        -                              default
vsd  bootfs                         -                              default
vsd  delegation                     on                             default
vsd  autoreplace                    off                            default
vsd  cachefile                      -                              default
vsd  failmode                       wait                           default
vsd  listsnapshots                  off                            default
vsd  autoexpand                     off                            default
vsd  dedupditto                     0                              default
vsd  dedupratio                     1.00x                          -
vsd  free                           131G                           -
vsd  allocated                      118G                           -
vsd  readonly                       off                            -
vsd  comment                        -                              default
vsd  expandsize                     -                              -
vsd  freeing                        0                              default
vsd  fragmentation                  12%                            -
vsd  leaked                         0                              default
vsd  bootsize                       -                              default
vsd  checkpoint                     -                              -
vsd  feature@async_destroy          enabled                        local
vsd  feature@empty_bpobj            active                         local
vsd  feature@lz4_compress           active                         local
vsd  feature@multi_vdev_crash_dump  enabled                        local
vsd  feature@spacemap_histogram     active                         local
vsd  feature@enabled_txg            active                         local
vsd  feature@hole_birth             active                         local
vsd  feature@extensible_dataset     enabled                        local
vsd  feature@embedded_data          active                         local
vsd  feature@bookmarks              enabled                        local
vsd  feature@filesystem_limits      enabled                        local
vsd  feature@large_blocks           enabled                        local
vsd  feature@large_dnode            enabled                        local
vsd  feature@sha512                 enabled                        local
vsd  feature@skein                  enabled                        local
vsd  feature@device_removal         enabled                        local
vsd  feature@obsolete_counts        enabled                        local
vsd  feature@zpool_checkpoint       enabled                        local
vsd  feature@spacemap_v2            active                         local


Code:
# zpool get all vsdnew
NAME  PROPERTY                       VALUE                          SOURCE
vsdnew   size                           248G                           -
vsdnew   capacity                       24%                            -
vsdnew   altroot                        -                              default
vsdnew   health                         ONLINE                         -
vsdnew   guid                           16748974042122959608           default
vsdnew   version                        -                              default
vsdnew   bootfs                         -                              default
vsdnew   delegation                     on                             default
vsdnew   autoreplace                    off                            default
vsdnew   cachefile                      -                              default
vsdnew   failmode                       wait                           default
vsdnew   listsnapshots                  off                            default
vsdnew   autoexpand                     off                            default
vsdnew   dedupditto                     0                              default
vsdnew   dedupratio                     1.00x                          -
vsdnew   free                           187G                           -
vsdnew   allocated                      60,5G                          -
vsdnew   readonly                       off                            -
vsdnew   comment                        -                              default
vsdnew   expandsize                     -                              -
vsdnew   freeing                        0                              default
vsdnew   fragmentation                  5%                             -
vsdnew   leaked                         0                              default
vsdnew   bootsize                       -                              default
vsdnew   checkpoint                     -                              -
vsdnew   feature@async_destroy          enabled                        local
vsdnew   feature@empty_bpobj            active                         local
vsdnew   feature@lz4_compress           active                         local
vsdnew   feature@multi_vdev_crash_dump  enabled                        local
vsdnew   feature@spacemap_histogram     active                         local
vsdnew   feature@enabled_txg            active                         local
vsdnew   feature@hole_birth             active                         local
vsdnew   feature@extensible_dataset     enabled                        local
vsdnew   feature@embedded_data          active                         local
vsdnew   feature@bookmarks              enabled                        local
vsdnew   feature@filesystem_limits      enabled                        local
vsdnew   feature@large_blocks           enabled                        local
vsdnew   feature@large_dnode            enabled                        local
vsdnew   feature@sha512                 enabled                        local
vsdnew   feature@skein                  enabled                        local
vsdnew   feature@device_removal         enabled                        local
vsdnew   feature@obsolete_counts        enabled                        local
vsdnew   feature@zpool_checkpoint       enabled                        local
vsdnew   feature@spacemap_v2            active                         local


Code:
# zdb
vsdnew:
    version: 5000
    name: 'vsdnew'
    state: 0
    txg: 209839
    pool_guid: 16748974042122959608
    hostid: 1527396945
    hostname: 'xxxxxxxxxxxxxxxxxx'
    com.delphix:has_per_vdev_zaps
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 16748974042122959608
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 16956873139552582793
            path: '/dev/xbd4'
            whole_disk: 1
            metaslab_array: 67
            metaslab_shift: 31
            ashift: 12
            asize: 268430737408
            is_log: 0
            DTL: 172
            create_txg: 4
            com.delphix:vdev_zap_leaf: 65
            com.delphix:vdev_zap_top: 66
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
vsd:
    version: 5000
    name: 'vsd'
    state: 0
    txg: 375079
    pool_guid: 1132442073773096964
    hostid: 1527396945
    hostname: 'xxxxxxxxxxxxxxxxxxxx'
    com.delphix:has_per_vdev_zaps
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 1132442073773096964
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 8774192215507463996
            path: '/dev/xbd5'
            whole_disk: 1
            metaslab_array: 67
            metaslab_shift: 30
            ashift: 9
            asize: 268430737408
            is_log: 0
            DTL: 374
            create_txg: 4
            com.delphix:vdev_zap_leaf: 65
            com.delphix:vdev_zap_top: 66
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
 
It's a modified version of 12.0-RELEASE-p10
Please note that 12.0 has been end-of-life since February 2020. As a matter of fact, the entire 12 major branch has been end-of-life since December 2023.
 
Please note that 12.0 has been end-of-life since February 2020. As a matter of fact, the entire 12 major branch has been end-of-life since December 2023.
Sure. But we have a few thousand such machines and can't update them as we would like. If we could, we would - be sure of that.
In my understanding, this could be a MAJOR problem in ZFS, provided there are no particular mistakes in our configuration. I have never read about such an effect in any report in recent years, so I guess it has not been fixed by any OS update. So I assume this is of interest to any ZFS user.

Especially as this effect develops really slowly. We noticed minimal effects like this in '22, a few MB/GB on some machines, which we ignored as nobody really cared about the amount. But now we have found a larger number of such machines with a remarkable number of GB lost.

As we have of course not ruled out the possibility that we made mistakes somewhere, any help and advice is welcome.
 
In my understanding, this could be a MAJOR problem in ZFS, provided there are no particular mistakes in our configuration. I have never read about such an effect in any report in recent years, so I guess it has not been fixed by any OS update.
12 still had the "old" ZFS that was imported from Solaris. 13 and later switched to OpenZFS.
 
I think your problem may be related to the recordsize, since it affects the block sizes. Your problem may be amplified by the number and size of your files. I will explain with my own example, but you surely want the opinion of another user with more experience.

We create two datasets with different recordsizes, as in your case:

Code:
zfs create -o recordsize=8k -o mountpoint=/mnt/v1 zroot/test_v1

### Default recordsize is 128k, no need to specify
zfs create -o mountpoint=/mnt/v2 zroot/test_v2

We create a small loop (csh syntax) to generate random files in both datasets:

Code:
foreach i (`seq 1 100`)
dd if=/dev/random bs=8k count=100 > /mnt/v1/file.$i
dd if=/dev/random bs=8k count=100 > /mnt/v2/file.$i
end

Check the size:

Code:
# zfs list |grep -e test
zroot/test_v1       51.9M  10.5G  51.9M  /mnt/v1
zroot/test_v2       52.9M  10.5G  52.9M  /mnt/v2

As you can see there is a small difference of about 1 MB, but if you check the file layout in blocks with zdb you can observe the following:

Code:
# ls -i /mnt/v2/file.1
673 /mnt/v2/file.1

# zdb -ddddd zroot/test_v1 678
Dataset zroot/test_v1 [ZPL], ID 225, cr_txg 92236, 40.0M, 107 objects, rootbp DVA[0]=<0:34cce1000:1000> DVA[1]=<0:64f57000:1000> [L0 DMU objset] fletcher4 uncompressed unencrypted LE contiguous unique double size=1000L/1000P birth=92815L/92815P fill=107 cksum=0000000dd9210249:000024a25e7575c7:003425f4ed2ea173:3513c2d546f0f81a

    Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
       678    2   128K     8K   408K     512   400K  100.00  ZFS plain file
                                               168   bonus  System attributes
        dnode flags: USED_BYTES USERUSED_ACCOUNTED USEROBJUSED_ACCOUNTED
        dnode maxblkid: 49
        path    /file.1
        uid     0
        gid     0
        atime   Mon Feb 19 18:16:56 2024
        mtime   Mon Feb 19 18:19:21 2024
        ctime   Mon Feb 19 18:19:21 2024
        crtime  Mon Feb 19 18:16:56 2024
        gen     92781
        mode    100644
        size    409600
        parent  34
        links   1
        pflags  40800000004
Indirect blocks:
               0 L1  0:34c1a7000:1000 20000L/1000P F=50 B=92813/92813 cksum=0000014fbbb39c11:0003460e35676022:047899d7da6c2e8c:65e70be73c39e62a
               0  L0 0:21d946000:2000 2000L/2000P F=1 B=92813/92813 cksum=0000041e0a4c2267:0010dab7fc5936f8:2d548dff08df8163:1ca78724d2063672
            2000  L0 0:21d948000:2000 2000L/2000P F=1 B=92813/92813 cksum=000003ff7f2ba6d8:001026e14d6ac816:2b63c5588ade660c:52d5a1c8b6d7c4d2
            4000  L0 0:21d94a000:2000 2000L/2000P F=1 B=92813/92813
          ...............

                segment [0000000000000000, 0000000000064000) size  400K
               
# zdb -ddddd zroot/test_v2 673
Dataset zroot/test_v2 [ZPL], ID 326, cr_txg 92239, 40.4M, 107 objects, rootbp DVA[0]=<0:2bb645000:1000> DVA[1]=<0:1c646d000:1000> [L0 DMU objset] fletcher4 uncompressed unencrypted LE contiguous unique double size=1000L/1000P birth=92815L/92815P fill=107 cksum=0000000eb4e5235a:0000276695cd9418:0037faf402041932:381a3b27619035ef

    Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
       673    2   128K   128K   412K     512   512K  100.00  ZFS plain file
                                               168   bonus  System attributes
        dnode flags: USED_BYTES USERUSED_ACCOUNTED USEROBJUSED_ACCOUNTED
        dnode maxblkid: 3
        path    /file.1
        uid     0
        gid     0
        atime   Mon Feb 19 18:16:56 2024
        mtime   Mon Feb 19 18:19:21 2024
        ctime   Mon Feb 19 18:19:21 2024
        crtime  Mon Feb 19 18:16:56 2024
        gen     92781
        mode    100644
        size    409600
        parent  34
        links   1
        pflags  40800000004
Indirect blocks:
               0 L1  0:21db3d000:1000 20000L/1000P F=4 B=92813/92813 cksum=0000009215e553b6:0002087cfee9fea0:03a2ad26bac2e16f:5990a572ea8b8a05
               0  L0 0:2d70ac000:20000 20000L/20000P F=1 B=92813/92813 cksum=0000400491d297d1:0ffb5c4c71e82b1b:76ad4824271161b8:11430aeba9106830
           20000  L0 0:2d70cc000:20000 20000L/20000P F=1 B=92813/92813 cksum=00004015c62ee47a:10100d913e36c0d3:f041d12af1572e38:365ca18ed5432307
           40000  L0 0:2d70ec000:20000 20000L/20000P F=1 B=92813/92813 cksum=0000403c91cf079f:100d65ff27a0e660:5690ad9c13f48ccc:a8d9753fcf1a6396
           60000  L0 0:2d710c000:5000 20000L/5000P F=1 B=92813/92813 cksum=0000086ff6677a67:00618816bb88f728:9868346b7d7024e1:86bfd17098331333
           80000  L0 0:0:0 20000L B=92813
           a0000  L0 0:0:0 20000L B=92813
           c0000  L0 0:0:0 20000L B=92813

                segment [0000000000000000, 0000000000080000) size  512K

There is a difference in the size of the segment; maybe that is your problem, or a part of it.
 
freejlr: thanks a lot for taking the time. The problem is: when I rsync the data to a pool with 128K recordsize, the space is freed, too. I can hardly imagine that rewrites etc. over time can result in doubled space usage because of the 128K recordsize. Has anyone out there experienced that? There is a mix of files of different sizes on that machine, not only small files. Hmm...

rootbert:

Code:
zfs get all vsd/xxxxxx
NAME           PROPERTY              VALUE                   SOURCE
vsd/xxxxxx  type                  filesystem              -
vsd/xxxxxx  creation              Mo. Jan. 29 20:27 2024  -
vsd/xxxxxx  used                  115G                    -
vsd/xxxxxx  available             124G                    -
vsd/xxxxxx  referenced            115G                    -
vsd/xxxxxx  compressratio         1.10x                   -
vsd/xxxxxx  mounted               no                      -
vsd/xxxxxx  quota                 none                    default
vsd/xxxxxx  reservation           none                    default
vsd/xxxxxx  recordsize            128K                    default
vsd/xxxxxx  mountpoint            /home/vsd/xxxxxx        inherited from vsd
vsd/xxxxxx  sharenfs              off                     default
vsd/xxxxxx  checksum              fletcher4               inherited from vsd
vsd/xxxxxx  compression           lz4                     inherited from vsd
vsd/xxxxxx  atime                 off                     inherited from vsd
vsd/xxxxxx  devices               on                      default
vsd/xxxxxx  exec                  on                      default
vsd/xxxxxx  setuid                on                      default
vsd/xxxxxx  readonly              off                     default
vsd/xxxxxx  jailed                on                      inherited from vsd
vsd/xxxxxx  snapdir               hidden                  inherited from vsd
vsd/xxxxxx  aclmode               discard                 default
vsd/xxxxxx  aclinherit            restricted              default
vsd/xxxxxx  createtxg             41                      -
vsd/xxxxxx  canmount              on                      default
vsd/xxxxxx  xattr                 on                      default
vsd/xxxxxx  copies                1                       default
vsd/xxxxxx  version               5                       -
vsd/xxxxxx  utf8only              off                     -
vsd/xxxxxx  normalization         none                    -
vsd/xxxxxx  casesensitivity       sensitive               -
vsd/xxxxxx  vscan                 off                     default
vsd/xxxxxx  nbmand                off                     default
vsd/xxxxxx  sharesmb              off                     default
vsd/xxxxxx  refquota              none                    default
vsd/xxxxxx  refreservation        none                    default
vsd/xxxxxx  guid                  15950033646381232234    -
vsd/xxxxxx  primarycache          all                     default
vsd/xxxxxx  secondarycache        metadata                inherited from vsd
vsd/xxxxxx  usedbysnapshots       0                       -
vsd/xxxxxx  usedbydataset         115G                    -
vsd/xxxxxx  usedbychildren        0                       -
vsd/xxxxxx  usedbyrefreservation  0                       -
vsd/xxxxxx  logbias               latency                 default
vsd/xxxxxx  dedup                 off                     default
vsd/xxxxxx  mlslabel                                      -
vsd/xxxxxx  sync                  standard                default
vsd/xxxxxx  dnodesize             legacy                  default
vsd/xxxxxx  refcompressratio      1.10x                   -
vsd/xxxxxx  written               115G                    -
vsd/xxxxxx  logicalused           127G                    -
vsd/xxxxxx  logicalreferenced     127G                    -
vsd/xxxxxx  volmode               default                 default
vsd/xxxxxx  filesystem_limit      none                    default
vsd/xxxxxx  snapshot_limit        none                    default
vsd/xxxxxx  filesystem_count      none                    default
vsd/xxxxxx  snapshot_count        none                    default
vsd/xxxxxx  redundant_metadata    all                     default


Code:
# zfs get all vsdnew/xxxxxx
NAME        PROPERTY              VALUE                   SOURCE
vsdnew/xxxxxx  type                  filesystem              -
vsdnew/xxxxxx  creation              Di. Feb.  6 16:17 2024  -
vsdnew/xxxxxx  used                  56,3G                   -
vsdnew/xxxxxx  available             180G                    -
vsdnew/xxxxxx  referenced            56,0G                   -
vsdnew/xxxxxx  compressratio         1.10x                   -
vsdnew/xxxxxx  mounted               yes                     -
vsdnew/xxxxxx  quota                 none                    default
vsdnew/xxxxxx  reservation           none                    default
vsdnew/xxxxxx  recordsize            8K                      inherited from vsdnew
vsdnew/xxxxxx  mountpoint            /home/vsdnew/xxxxxx             default
vsdnew/xxxxxx  sharenfs              off                     default
vsdnew/xxxxxx  checksum              fletcher4               inherited from vsdnew
vsdnew/xxxxxx  compression           lz4                     inherited from vsdnew
vsdnew/xxxxxx  atime                 off                     inherited from vsdnew
vsdnew/xxxxxx  devices               on                      default
vsdnew/xxxxxx  exec                  on                      default
vsdnew/xxxxxx  setuid                on                      default
vsdnew/xxxxxx  readonly              off                     default
vsdnew/xxxxxx  jailed                off                     default
vsdnew/xxxxxx  snapdir               hidden                  inherited from vsdnew
vsdnew/xxxxxx  aclmode               discard                 default
vsdnew/xxxxxx  aclinherit            restricted              default
vsdnew/xxxxxx  createtxg             13                      -
vsdnew/xxxxxx  canmount              on                      default
vsdnew/xxxxxx  xattr                 off                     temporary
vsdnew/xxxxxx  copies                1                       default
vsdnew/xxxxxx  version               5                       -
vsdnew/xxxxxx  utf8only              off                     -
vsdnew/xxxxxx  normalization         none                    -
vsdnew/xxxxxx  casesensitivity       sensitive               -
vsdnew/xxxxxx  vscan                 off                     default
vsdnew/xxxxxx  nbmand                off                     default
vsdnew/xxxxxx  sharesmb              off                     default
vsdnew/xxxxxx  refquota              none                    default
vsdnew/xxxxxx  refreservation        none                    default
vsdnew/xxxxxx  guid                  14495481347814093474    -
vsdnew/xxxxxx  primarycache          all                     default
vsdnew/xxxxxx  secondarycache        metadata                inherited from vsdnew
vsdnew/xxxxxx  usedbysnapshots       308M                    -
vsdnew/xxxxxx  usedbydataset         56,0G                   -
vsdnew/xxxxxx  usedbychildren        0                       -
vsdnew/xxxxxx  usedbyrefreservation  0                       -
vsdnew/xxxxxx  logbias               latency                 default
vsdnew/xxxxxx  dedup                 off                     default
vsdnew/xxxxxx  mlslabel                                      -
vsdnew/xxxxxx  sync                  standard                default
vsdnew/xxxxxx  dnodesize             legacy                  default
vsdnew/xxxxxx  refcompressratio      1.10x                   -
vsdnew/xxxxxx  written               303M                    -
vsdnew/xxxxxx  logicalused           59,9G                   -
vsdnew/xxxxxx  logicalreferenced     59,4G                   -
vsdnew/xxxxxx  volmode               default                 default
vsdnew/xxxxxx  filesystem_limit      none                    default
vsdnew/xxxxxx  snapshot_limit        none                    default
vsdnew/xxxxxx  filesystem_count      none                    default
vsdnew/xxxxxx  snapshot_count        none                    default
vsdnew/xxxxxx  redundant_metadata    all                     default
 
If it were about recordsize, wouldn't the values of logicalused and logicalreferenced look different?
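
One way to compare that directly (a sketch; the gap between used and logicalused shows how much physical overhead each dataset carries):

Code:
# zfs get -r used,logicalused,referenced,logicalreferenced,compressratio vsd vsdnew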
 
rootbert - sure ;)

Code:
# zdb -dv vsd/xxxxxx | head -n15
Dataset vsd/xxxxxx [ZPL], ID 91, cr_txg 41, 115G, 627054 objects


    ZIL header: claim_txg 375076, claim_blk_seq 1, claim_lr_seq 0 replay_seq 0, flags 0x2




    Object  lvl   iblk   dblk  dsize  dnsize lsize   %full  type
         0    6   128K    16K   137M     512   645M   47.44  DMU dnode
        -1    1   128K  1.50K     1K     512  1.50K  100.00  ZFS user/group used
        -2    1   128K     1K     1K     512     1K  100.00  ZFS user/group used
         1    1   128K     1K     1K     512     1K  100.00  ZFS master node
         2    1   128K    512      0     512    512  100.00  SA master node
         3    1   128K     9K      0     512     9K  100.00  ZFS delete queue
         4    1   128K     2K     1K     512     2K  100.00  ZFS directory
         5    1   128K  1.50K     1K     512  1.50K  100.00  SA attr registration
         6    1   128K    16K     8K     512    32K  100.00  SA attr layouts
# zdb -dv vsd/xxxxxx | tail -n15
   1312075    1   128K  1.50K     1K     512  1.50K  100.00  ZFS plain file
   1312083    1   128K     2K     2K     512     2K  100.00  ZFS plain file
   1312084    1   128K     1K     1K     512     1K  100.00  ZFS plain file
   1312090    1   128K     2K     2K     512     2K  100.00  ZFS plain file
   1312091    1   128K     1K     1K     512     1K  100.00  ZFS plain file
   1312737    1   128K    512      0     512    512  100.00  ZFS directory
   1312779    1   128K    512      0     512    512  100.00  ZFS plain file
   1318012    1   128K     8K     4K     512     8K  100.00  ZFS plain file
   1321356    1   128K    512      0     512    512  100.00  ZFS plain file
   1321596    1   128K    512      0     512    512    0.00  ZFS plain file
   1321597    1   128K    512      0     512    512    0.00  ZFS plain file
   1321620    1   128K    512      0     512    512    0.00  ZFS plain file
   1321651    1   128K    512      0     512    512    0.00  ZFS plain file
   1321659    1   128K    15K  8.50K     512    15K  100.00  ZFS plain file


Code:
# zdb -dv vsdnew/xxxxxx | head -n15
Dataset vsdnew/xxxxxx [ZPL], ID 81, cr_txg 13, 56.0G, 621568 objects


    ZIL header: claim_txg 0, claim_blk_seq 0, claim_lr_seq 0 replay_seq 0, flags 0x0


                TX_WRITE            len   8384, txg 227138, seq 2315271
                TX_WRITE            len   8384, txg 227138, seq 2315272
                TX_WRITE            len   8384, txg 227138, seq 2315273
                TX_WRITE            len   8384, txg 227138, seq 2315274
                TX_WRITE            len   8384, txg 227138, seq 2315275
                TX_WRITE            len   8384, txg 227138, seq 2315276
                TX_WRITE            len   8384, txg 227138, seq 2315277
                TX_WRITE            len   8384, txg 227138, seq 2315278
                TX_WRITE            len   8384, txg 227138, seq 2315279
                TX_WRITE            len   8384, txg 227138, seq 2315280
                TX_WRITE            len   8384, txg 227138, seq 2315281
# zdb -dv vsdnew/xxxxxx | tail -n15
   1063598    1   128K    512      0     512    512    0.00  ZFS plain file
   1063599    1   128K    512      0     512    512    0.00  ZFS plain file
   1063600    1   128K    512      0     512    512    0.00  ZFS plain file
   1063601    1   128K    512      0     512    512  100.00  ZFS directory
   1108987    2   128K     8K    24K     512    24K  100.00  ZFS plain file
   1108989    2   128K     8K   428K     512   632K  100.00  ZFS plain file
   1108991    2   128K     8K    76K     512    80K  100.00  ZFS plain file
   1108993    2   128K     8K    24K     512    16K  100.00  ZFS plain file
   1108995    2   128K     8K    20K     512    16K  100.00  ZFS plain file
   1108997    2   128K     8K    68K     512    64K  100.00  ZFS plain file
   1108999    1   128K     7K     4K     512     7K  100.00  ZFS plain file
   1109001    1   128K  7.50K     8K     512  7.50K  100.00  ZFS plain file
   1109003    2   128K     8K    20K     512    16K  100.00  ZFS plain file
   1109005    2   128K     8K    32K     512    32K  100.00  ZFS plain file
 
Are you absolutely sure you do not change 50 GiB worth of data? Examine the, errm… used property (zfsprops(7)) of your snapshots.
[…] There is a cronjob, create a new, daily snapshot and remove the oldest snapshot (keeping 7 days). […]
Are you absolutely sure the cron job does the task correctly? Double-check in zpool-history(8) that the series of commands is plausible.
[…] System commands like du and df showing always (nearly) the right usage of the dataset, […]
One should keep in mind that such tools report payload sizes. Do you actively use ACLs and extended attributes? Your rsync(1) command preserves them, but maybe that's just for the sake of completeness. I'd deactivate these features to rule them out as a cause.
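
A stripped-down copy without ACLs, xattrs and file flags could look roughly like this (same paths as your original command, purely a sketch):

Code:
# like the original rsync, but without -A (ACLs), -X (xattrs) and --fileflags
rsync -aH --delete /home/vsd/ /home/vsdnew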

Last but not least: “Have you tried turning it off and on again?” Maybe there are open but already deleted files, I don’t know.
 
A few comments / suggestions:
  • Can you export and re-import the pool (or reboot)? This will make sure you don't have any left-open files using space (as Kai Burghardt suggests above).
  • You mention setting recordsize and ashift together a few times; these are separate entities:
    • recordsize sets the largest logical record that ZFS will process on the filesystem. This is used, for example, to chunk data for checksums and compression.
    • ashift sets the smallest physical allocation ZFS will make on the vdevs.
    • Note there is interplay; with a recordsize of 8k and an ashift of 12 (4k), an 8k record must compress down to 4k or less (i.e. by at least 50%) before compression saves any space on disk.
    • Unless you have a good reason (benchmarked an important task and saw a performance boost), don't use an 8k recordsize; stick with the default 128k. Remember it sets a maximum; ZFS will use multiples of the 2**ashift-sized blocks up to the recordsize as needed. Almost everything is happier with larger recordsizes available to use. (Main counterexample is for a DB or VM workload with a known IO size where you want to avoid RMW space (w/snapshots) & IO overhead.)
    • Note setting the recordsize only impacts new writes (and not receives; see next point) to the filesystem.
  • A zfs send/recv won't change the recordsize (unless records are larger than 128k and the -L option is not used), so you'll get the same effective record sizes on the receive side as you had on the sending side. (An rsync of the data, however, will be affected by the recordsize on the receive side, and will also get the chance to write everything in as large a block as possible.)
  • Where is your snapshot usage? You state you have automatic snapshots, but your 'zfs list -o space' shows all zeros for USEDSNAP. Is nothing ever overwritten or deleted on the filesystem? You can double-check with 'zfs get -r used pool/fs' to see if you have snapshots that are using space.
  • If you run zdb on the pool, there should be a 'Block Size Histogram' section; that would be interesting to compare between the two systems as well (example commands below).
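
For the last two points, something along these lines should do (pool name as in this thread; the zdb block walk can take a while on a large pool):

Code:
# per-dataset and per-snapshot accounting -- any snapshot holding data shows up here
zfs get -r used vsd
zfs list -t snapshot -o name,used,refer -r vsd

# block statistics; look for the 'Block Size Histogram' section (add more -b for extra detail)
zdb -bb vsd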
 
Sorry for no updates for a long time.
I went another way to show such 'lost' data, as rootbert pointed me to zdb.

This is the new situation:
I copied all data from one pool to another pool of the same size via send/receive. After that, all the info was identical. Still, in this case, all ZFS tools show about 70-80GB MORE used data than tools like df, du, etc.
I'm aware that the userland tools don't have all the information about metadata and the like, but 70-80GB??
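
For reference, a send/receive copy of the whole pool typically looks like this (the snapshot name here is only a placeholder):

Bash:
# recursive snapshot of the source pool, then full replication into the new pool
zfs snapshot -r vsd@copy
zfs send -R vsd@copy | zfs receive -F vsdnew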

Then I deleted ALL files and directories from the copy pool. The fun part: NO files or directories, but still a usedds of 77.4G:
(about 2.44G of data legitimately belongs to vsdnew itself)

Bash:
# zfs list -t all -o name,used,refer,space,creation,written -r vsdnew
NAME            USED  REFER  NAME           AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD  CREATION                WRITTEN
vsdnew         79,8G  2,44G  vsdnew          209G  79,8G         0   2,44G              0      77,4G  Mo. Juli 15 14:31 2024    2,44G
vsdnew/xxxxxx  77,4G  77,4G  vsdnew/xxxxxx   209G  77,4G         0   77,4G              0          0  Mo. Juli 15 14:31 2024    77,4G

The empty dataset still shows ~70.000 objects in zdb:

Bash:
# zdb -dv vsdnew/xxxxxx | wc -l
   70257

Bash:
# zdb -dv vsdnew/xxxxxx | head -n15
Dataset vsdnew/xxxxxx [ZPL], ID 82, cr_txg 5, 77.4G, 70250 objects

    Object  lvl   iblk   dblk  dsize  dnsize lsize   %full  type
         0    6   128K    16K  45.7M     512  1.51G    2.21  DMU dnode
        -1    1   128K  3.50K     1K     512  3.50K  100.00  ZFS user/group used
        -2    1   128K     1K     1K     512     1K  100.00  ZFS user/group used
         1    1   128K     1K     1K     512     1K  100.00  ZFS master node
         2    1   128K    512      0     512    512  100.00  SA master node
         3    2   128K    16K   293K     512   672K  100.00  ZFS delete queue
         4    1   128K     2K      0     512     2K  100.00  ZFS directory
         5    1   128K  1.50K     1K     512  1.50K  100.00  SA attr registration
         6    1   128K    16K     8K     512    32K  100.00  SA attr layouts
         7    1   128K    512      0     512    512  100.00  ZFS directory
        24    1   128K    56K  21.5K     512    56K  100.00  ZFS plain file
       157    1   128K    512      0     512    512  100.00  ZFS plain file


vsd/xxxxxx is the original pool with all data:

Bash:
# zdb -dv vsd/xxxxxx | head -n15
Dataset vsd/xxxxxx [ZPL], ID 90, cr_txg 20, 267G, 703493 objects


    ZIL header: claim_txg 0, claim_blk_seq 0, claim_lr_seq 0 replay_seq 0, flags 0x0




    Object  lvl   iblk   dblk  dsize  dnsize lsize   %full  type
         0    6   128K    16K   194M     512  1.51G   22.16  DMU dnode
        -1    1   128K  3.50K     2K     512  3.50K  100.00  ZFS user/group used
        -2    1   128K     1K     1K     512     1K  100.00  ZFS user/group used
         1    1   128K     1K     1K     512     1K  100.00  ZFS master node
         2    1   128K    512      0     512    512  100.00  SA master node
         3    2   128K    16K   293K     512   672K  100.00  ZFS delete queue
         4    1   128K     2K     1K     512     2K  100.00  ZFS directory
         5    1   128K  1.50K     1K     512  1.50K  100.00  SA attr registration
         6    1   128K    16K     8K     512    32K  100.00  SA attr layouts

The mountpoint:

Bash:
# du -Axhs /home/vsdnew/xxxxxx
512B    /home/vsdnew/xxxxxx



So, where the hell is ZFS (this is the FreeBSD ZFS from FreeBSD 12.0) storing all this obviously senseless data?
 
Addition:

I went a little deeper and can show that this pool has about 70.000 (!!!!) wrong entries for files and directories:


When I take an object from the zdb list (no matter which) I get:
Bash:
# zdb -ddddd vsdnew/xxxxxx 6497
Dataset vsdnew/xxxxxx [ZPL], ID 82, cr_txg 5, 77.4G, 70250 objects, rootbp DVA[0]=<0:26f0cbca00:200> DVA[1]=<0:e93d6ae00:200> [L0 DMU objset] fletcher4 lz4 LE contiguous unique double size=800L/200P birth=2105L/2105P fill=70250 cksum=d28be407c:471f8f945c0:cf6cad2df143:1ae7fbabaae777


    Object  lvl   iblk   dblk  dsize  dnsize lsize   %full  type
      6497    2   128K   128K   645K     512   768K  100.00  ZFS plain file
                                               168   bonus  System attributes
 dnode flags: USED_BYTES USERUSED_ACCOUNTED
 dnode maxblkid: 5
 path path not found, possibly leaked
 uid     1008
 gid     1005
 atime  Sat Dec 17 10:38:21 2022
 mtime  Sat Dec 17 10:38:21 2022
 ctime  Mon Dec 19 09:28:32 2022
 crtime Sat Dec 17 10:38:21 2022
 gen  148107
 mode 100600
 size 770080
 parent 120918
 links  1
 pflags 40800000004
Indirect blocks:
               0 L1  0:1229a8a00:400 20000L/400P F=6 B=100/100
               0  L0 0:a663ae00:19c00 20000L/19c00P F=1 B=100/100
           20000  L0 0:a6674a00:1a800 20000L/1a800P F=1 B=100/100
           40000  L0 0:a6654a00:20000 20000L/20000P F=1 B=100/100
           60000  L0 0:a668f200:1bc00 20000L/1bc00P F=1 B=100/100
           80000  L0 0:a66aae00:1be00 20000L/1be00P F=1 B=100/100
           a0000  L0 0:a66c6c00:14e00 20000L/14e00P F=1 B=100/100


  segment [0000000000000000, 00000000000c0000) size  768K


No path can be found, but ZFS is sure that this file is taking quota from the system.
How can this happen >70.000 times within a few months?



And much more important: how can one get ZFS to CHECK its entries and correct its wrong information?


As always: any help is welcome!
 
I thought I was losing free space on my Z2 setup, but finally figured out deletes were going to the recycle bin, even though I thought I had that disabled.
 
There is no recycle bin here. You may be thinking of FreeNAS - that's a difference. And 'path not found, possibly leaked' would not show up.
 
as posted:

Code:
vsdnew  /home/vsdnew    zfs rw,noatime  0   0
vsdnew/xxxxx   /home/vsdnew/xxxxxx zfs rw,noatime  0   0

Code:
/home/vsdnew
/home/vsdnew/xxxxxx
 
Check the mountpoints inside the zpool, or boot into a live CD, import vsdnew with an altroot, manually mount vsdnew/xxxxx, and set snapdir to visible. Do not mount vsdnew on top of vsdnew/xxxxx.


Code:
zfs get snapdir vsdnew/xxxxx
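
The manual import could look roughly like this (from a live CD; /mnt is just an example altroot):

Code:
zpool import -N -R /mnt vsdnew        # import with altroot, don't mount anything yet
zfs set snapdir=visible vsdnew/xxxxx
mount -t zfs vsdnew/xxxxx /mnt        # manual (legacy-style) mount, as in your fstab
ls /mnt/.zfs/snapshot                 # see whether snapshots still reference the data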
 