ZFS Fixing partition alignment on a ZFS boot disk?

Code:
~ gpart show nda1
=>      34  62914493  nda1  GPT  (30G)
        34       345     1  freebsd-boot  (173K)
       379     66584     2  efi  (33M)
     66963   2097152     3  freebsd-swap  (1.0G)
   2164115  60748397     4  freebsd-zfs  (29G)
  62912512      2015        - free -  (1.0M)

I don't care about the first two partitions. I do care about swap and the zfs partition.

I was contemplating gpart backup, making edits, then gpart restore onto a new disk, then dd the contents across. But I don't think ZFS would take kindly to an unannounced lift-and-shift onto new LBAs.
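
Something like this is what I had in mind, assuming the spare disk would show up as nda2 (just a sketch, untested):

Code:
# dump the current partition table to a text file
gpart backup nda1 > nda1.gpt

# edit nda1.gpt so the freebsd-zfs entry starts on a 1m-aligned LBA,
# then recreate that layout on the spare disk
gpart restore -F nda2 < nda1.gpt

# ...and this is the unannounced lift-and-shift part
dd if=/dev/nda1p4 of=/dev/nda2p4 bs=1m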

Are zfs send and zfs recv the answer? Or perhaps create a mirror vdev from the old+new partitions, boot from the new device, then break the mirror, at which point ZFS complains about the missing device for the rest of eternity?

This is a rather messy problem...
 

if you move the partition towards the beginning of the disk and are mega-brave then you can dd it directly without a spare disk
you still have to do it by booting from external media
the theory is like
dd if=/dev/nda1p4 bs=1m | dd of=/dev/nda1 seek=1056 bs=1m
the 2nd dd overwrites what the 1st has already read

a power failure would foobar everything hard

then you fix the partition table so the zfs partition starts at 1056m
the swap partition would be shrunk by a bit less than 1mb

don't try this at home
 
Never dd ZFS providers! For one, it circumvents all of ZFS's integrity checks and self-healing capabilities and may damage the vdev or even the whole pool (e.g. due to single-bit errors in ZFS metadata during the dd - been there, wasn't funny). Second, it is horribly inefficient compared to a proper ZFS resilver.

Add a new disk (image) to the VM, create the GPT table with properly aligned partitions and add the bootcode/UEFI partition etc., then zpool attach(8) the new ZFS partition to the existing vdev and let it resilver. After resilvering, zpool detach(8) the old, misaligned provider. Shut down the VM, remove the old disk/image from the VM configuration, start the VM, collect underpants, then profit.
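
A minimal sketch of that sequence, assuming the pool is called zroot, the old provider is nda1p4 and the new disk shows up as nda2 (adjust names to your setup; the bootcode/ESP/swap steps are left out here):

Code:
# partition the new disk with a 1m-aligned zfs partition
gpart create -s gpt nda2
gpart add -a 1m -t freebsd-zfs -l zfs1 nda2

# attach the new partition to the existing single-disk vdev, turning it into a mirror
zpool attach zroot nda1p4 nda2p1

# wait for the resilver to finish, then drop the misaligned provider
zpool status zroot
zpool detach zroot nda1p4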


That being said - depending on what kind of disk image the hypervisor is using, the partition alignment is completely irrelevant anyway...
 
then you fix the partition table so the zfs partition starts at 1056m

Out of curiosity, how do you fix the partition table? Is there a tool for that?
I ask because it's somewhat complex to do by hand.

I imagine:

- Copy the fourth entry into the first slot.
- Change the partition's starting and ending LBA in that entry.
- Zero out the other entries (it's not clear whether we get rid of the swap partition or not).
- Recompute the CRC32 checksum of the table.
- Copy this new table to the backup area at the end of the disk and correct the checksum in the corresponding header.
 
I can write you a short how-to on migrating the zpool from one disk to another using send/receive. As this is a VM, why are you using ZFS on it, and what do you expect to gain from aligning the partitions on the virtual disk?
 
Out of curiosity, how do you fix the partition table? Is there a tool for that?
I ask because it's somewhat complex to do by hand.

I imagine:

- Copy the fourth entry into the first slot.
- Change the partition's starting and ending LBA in that entry.
- Zero out the other entries (it's not clear whether we get rid of the swap partition or not).
- Recompute the CRC32 checksum of the table.
- Copy this new table to the backup area at the end of the disk and correct the checksum in the corresponding header.
you can just delete and recreate with another starting block using gpart
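
roughly like this, reusing the numbers from the first post and assuming the data has already been dd'd down to the 1056m boundary (double-check everything against your own gpart show first):

Code:
# drop the old freebsd-zfs entry -- this only rewrites the table, not the data
gpart delete -i 4 nda1

# shrink swap so it ends at the new boundary (2162688 = 1056m in 512-byte blocks)
gpart resize -i 3 -s 2095725 nda1

# recreate the zfs partition starting at the aligned LBA
gpart add -i 4 -b 2162688 -t freebsd-zfs nda1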
 
I just had the bright idea to set up a new installation from scratch, mount my old zroot to the new install, and copy over my scripts, confs, tunables, and whatnot. DEAR LORD what a trainwreck when two zroot pools are present. ZFS is bratty and I hate it a little right now. There's got to be a clean way to do this...
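
What I apparently should have done is import the old pool by its numeric id, under a different name and an altroot, so the two zroots can coexist. A rough sketch (the id below is made up):

Code:
# list importable pools and note the numeric id of the old zroot
zpool import

# import it under a different name and an altroot so it can't clash with the running pool
zpool import -f -R /mnt 1234567890123456789 zroot_old

# boot environment datasets are canmount=noauto, so mount the old root explicitly
zfs mount zroot_old/ROOT/default

# ...copy rc.conf, loader.conf, scripts etc. from under /mnt, then let go of the old pool
zpool export zroot_old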
 
In the past I moved partitions because of alignment nags. They never got any faster. I guess I don't really understand all the bits involved. Maybe it isn't an exact science.
 
too many translation layers in between. if the disk is native 4k but emulates 512b sectors, a 16k fs block can end up spanning 5 physical blocks instead of 4 when the alignment is not optimal
same with ssd write/erase zones, which are 32k or something
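
toy example of the 16k case, with made-up offsets and plain sh arithmetic:

Code:
# physical 4k sectors touched by a 16k block starting at 512-byte LBA 7 (misaligned) vs LBA 8 (aligned)
echo $(( (7*512 + 16*1024 - 1) / 4096 - (7*512) / 4096 + 1 ))   # -> 5
echo $(( (8*512 + 16*1024 - 1) / 4096 - (8*512) / 4096 + 1 ))   # -> 4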
 
Here's how you can migrate your ZFS zroot to a new disk (da1). First you will need a USB stick with the same FreeBSD version as the currently running OS.
Boot from the FreeBSD installer, select Live System, then log in as root (no password).

# The currently running da0 disk, which we are going to migrate to the new disk (da1)
Code:
# gpart show
=>       40  266338224  da0  GPT  (127G)
         40     532480    1  efi  (260M)
     532520       1024    2  freebsd-boot  (512K)
     533544        984       - free -  (492K)
     534528    4194304    3  freebsd-swap  (2.0G)
    4728832  261607424    4  freebsd-zfs  (125G)
  266336256       2008       - free -  (1.0M)

# camcontrol devlist
<Msft Virtual Disk 1.0> at scbus0 target 0 lun 0 (pass0,da0)
<Msft Virtual Disk 1.0> at scbus0 target 0 lun 1 (pass1,da1)

# Create a new partitioning scheme on the new disk
gpart create -s gpt da1

# Add a new efi system partition (ESP)
gpart add -a 4k -l efiboot0 -t efi -s 260M da1

# Format the ESP
newfs_msdos da1p1

# Add new Boot partition for Legacy boot (BIOS)
gpart add -a 4k -l gptboot0 -t freebsd-boot -s 512k da1

# Add the protective master boot record and bootcode
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 2 da1

# Create new swap partition
gpart add -a 1m -l swap0 -t freebsd-swap -s 2G da1

# Create new ZFS partition to the rest of the disk space
gpart add -a 1m -l zfs0 -t freebsd-zfs da1

# mount the ESP partition
mount_msdosfs /dev/da1p1 /mnt

# Create the directories and copy the efi loader in the ESP
mkdir -p /mnt/efi/boot
mkdir -p /mnt/efi/freebsd
cp /boot/loader.efi /mnt/efi/boot/bootx64.efi
cp /boot/loader.efi /mnt/efi/boot/loader.efi

# Create the new UEFI boot variable and unmount the ESP
efibootmgr -a -c -l /mnt/efi/boot/loader.efi -L FreeBSD-14
umount /mnt

# Create mountpoint for zroot and zroot_new
mkdir /tmp/zroot
mkdir /tmp/zroot_new

# Create the new ZFS pool on the new disk (zroot_new)
zpool create -o altroot=/tmp/zroot_new -O compress=lz4 -O atime=off -m none -f zroot_new da1p4

# Import the original zroot
zpool import -R /tmp/zroot zroot

# Create a snapshot and send it to zroot_new on the other disk.
zfs snapshot -r zroot@migration
zfs send -vR zroot@migration | zfs receive -Fdu zroot_new

# Export zroot and zroot_new, then re-import zroot_new under the new name zroot
zpool export zroot
zpool export zroot_new
zpool import -R /tmp/zroot zroot_new zroot

# Set the default boot dataset
zpool set bootfs=zroot/ROOT/default zroot

# Clean up the snapshots created for the migration
zfs list -t snapshot -H -o name | grep migration | xargs -n1 zfs destroy

# export the pool
zpool export zroot

# Shut down and remove the old disk
shutdown -p now
# After the reboot, select FreeBSD-14 from the UEFI boot menu and, if everything works, clean up the old UEFI record using efibootmgr
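# (Sketch of that cleanup - the boot entry number below is only an example;
#  find the real stale entry in the efibootmgr -v output first)
efibootmgr -v
efibootmgr -B -b 0003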
 
In the past I moved partitions because of alignment nags. They never got any faster. I guess I don't really understand all the bits involved. Maybe it isn't an exact science.
Back in the day when disks' geometry mattered and LBA wasn't a thing, alignment was a thing. Today it doesn't matter. I ignore the nags.
 
Back in the day when disks' geometry mattered and LBA wasn't a thing, alignment was a thing. Today it doesn't matter. I ignore the nags.

On flash drives it really doesn't matter - the ancient concept of blocks and sectors doesn't apply any more. Those drives just pretend to be structured like magnetic drums from 70 years ago, but their actual IO patterns are completely different and managed by their firmware - so they absolutely don't care whether your IO lands 512 bytes earlier or later in that fictional mapping, because your 512-byte or even 4k chunks are comically small for them anyway.

Roughly the same applies for VMs sitting on non-raw disk images that have their own internal data structure and possibly even some compression going on...
 
On flash drives it really doesn't matter - the ancient concept of blocks and sectors doesn't apply any more. Those drives just pretend to be structured like magnetic drums from 70 years ago, but their actual IO patterns are completely different and managed by their firmware - so they absolutely don't care whether your IO lands 512 bytes earlier or later in that fictional mapping, because your 512-byte or even 4k chunks are comically small for them anyway.

Nice that you restated what I just said. Thank you for reinforcing that.

Roughly the same applies for VMs sitting on non-raw disk images that have their own internal data structure and possibly even some compression going on...
Ditto.
 