ZFS: Can't boot from zroot pool for no apparent reason

Hello to all,

unfortunately I'm facing the infamous boot-time messages below while trying to recover a friend's NAS, which I installed last year.

Code:
zio_read error: 5
ZFS: i/o error - all block copies unavailable
ZFS can't read MOS of pool zroot

It seems to reach stage 2, but gptzfsboot can't access the root pool.

Fortunately it served as a secondary archive server, wasn't heavily used, and all user data has been safely copied off for now.
I successfully booted a USB stick with the 13.2 release, imported the pool with the altroot option on the first attempt, and read all the data.
Then I checked for hardware errors and browsed through old dmesg and syslog messages on the filesystem, but found nothing.
All drives seem fine, no relevant SMART counters have incremented, and there were no apparent glitches over a few days of I/O load at my place.
I also ran a fresh scrub, which found no errors and corrected nothing.
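For completeness, the recovery steps from the live USB went roughly like this (reconstructed from memory; /mnt is just the altroot I used):

Code:
# import read/write under an alternate root, away from the live system
zpool import -f -R /mnt zroot
# fresh scrub and its result
zpool scrub zroot
zpool status -v zroot
# clean export afterwards
zpool export zroot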

Hardware-wise, it is an older HP MicroServer Gen8 with four 8 TB SATA drives bought last year.
The pool consists of two mirrored vdevs (akin to conventional RAID 10). It's pretty much the standard setup from the installer, without any further customization.
Each drive has a GPT table and three partitions: freebsd-boot, a gmirror member for swap, and finally a freebsd-zfs partition running to the end of the disk.
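(For reference, that per-drive layout corresponds roughly to what the installer creates with commands like these - the sizes and labels here are illustrative, not copied from the actual server:)

Code:
gpart create -s gpt ada0
gpart add -t freebsd-boot -s 512k -l gptboot0 ada0
gpart add -t freebsd-swap -s 2g -l swap0 ada0
gpart add -t freebsd-zfs -l zfs0 ada0
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0
# (the swap partitions are then joined into a gmirror, omitted here)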
All datasets use the common 128k record size.
As the server was in an isolated network segment and was used only for occasional data offloading, it stayed on the same system version since installation - 13.0-RELEASE-p10; no system upgrades with the related snapshots were done after initial testing.
I tried a clean unmount and export of the pool from the live environment, but it didn't help with booting.
I also tried installing a fresh pmbr and gptzfsboot from the latest 13.2-RELEASE, but that didn't help either, so I reverted back to the original versions.
I tried to check the boot environments, but that doesn't seem to be possible from the live environment, even in a chrooted system - I always end up with bectl: libbe_init("") failed.
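What I ran, roughly, plus a variant I still want to test (bectl(8) documents a -r option for pointing it at another pool's BE root, which might be the way to do this from a live system - I haven't verified that yet):

Code:
# from the live environment, pool imported under /mnt
bectl list                # -> bectl: libbe_init("") failed
chroot /mnt bectl list    # same error inside the chroot
# per bectl(8), untested here:
bectl -r zroot/ROOT list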
There is of course zroot/ROOT/default, and it seems to be fine.

I dumped all the relevant details that came to my mind there (it's long, so I've made a gist).

As the error message is pretty much a catch-all, I found and browsed maybe dozens of similar threads and questions about booting issues, but after several attempts the remedies mentioned there either don't seem applicable to my setup or didn't help.
Of course I might have overlooked something, so I'd be glad for any further ideas.
I only found an old issue with gptzfsboot which theoretically fits this system (BIOS-only boot, larger 8 TB drives), but I'm not sure if it's applicable to me, especially since the system booted fine before.

In a normal situation I'd have already reinstalled the server, but it's a bit frustrating that I haven't found any cause of the issue so far (a hardware error or even my own configuration mistake would be a relief ;)).
Now I'm a bit reluctant to repeat the same ZFS setup, as I have no clue whether it won't break again.

Thanks,

Michal
 
Maybe I'll try asking a different question. Have you encountered similar issues when using larger (>2 TB) drives for the root pool on a system with gptzfsboot (e.g. when the particular computer doesn't support the EFI loader)? Or are such setups rather common and shouldn't have any inherent issues?
 
Boot from another medium,
zpool import
zpool import -R ... zroot ...
zpool status -x
zpool status -v
zpool list -v
hardware problem ?
reinstall bootcode/loader ?
 
Alain,

thank you for the reply and hints.

Boot from another medium,
zpool import
zpool import -R ... zroot ...
zpool status -x
zpool status -v
zpool list -v

All of it is detailed in the gist I linked in the first post. Everything seems to be fine. As I've mentioned, when I started the live environment from the USB stick, I imported the pool on the first attempt and successfully copied all data from it.
Anyway, here is the output from the commands you've mentioned.

Code:
   pool: zroot
     id: 3252755368503340536
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.
 config:

        zroot       ONLINE
          mirror-0  ONLINE
            ada0p3  ONLINE
            ada1p3  ONLINE
          mirror-1  ONLINE
            ada2p3  ONLINE
            ada3p3  ONLINE
all pools are healthy
  pool: zroot
 state: ONLINE
  scan: scrub repaired 0B in 00:05:48 with 0 errors on Fri Nov  3 22:16:28 2023
config:

        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0p3  ONLINE       0     0     0
            ada1p3  ONLINE       0     0     0
          mirror-1  ONLINE       0     0     0
            ada2p3  ONLINE       0     0     0
            ada3p3  ONLINE       0     0     0

errors: No known data errors
NAME         SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
zroot       14.5T   118G  14.4T        -         -     0%     0%  1.00x    ONLINE  /mnt
  mirror-0  7.27T  59.4G  7.21T        -         -     0%  0.79%      -    ONLINE
    ada0p3  7.28T      -      -        -         -      -      -      -    ONLINE
    ada1p3  7.28T      -      -        -         -      -      -      -    ONLINE
  mirror-1  7.27T  59.0G  7.21T        -         -     0%  0.79%      -    ONLINE
    ada2p3  7.28T      -      -        -         -      -      -      -    ONLINE
    ada3p3  7.28T      -      -        -         -      -      -      -    ONLINE


hardware problem ?
reinstall bootcode/loader ?

I haven't noticed any HW issues during my testing. I also tried some writes to the pool and checked for logged SMART errors - nothing even slightly suspicious.
I believe the loading of the pmbr part (stage 0) and the gptzfsboot part (stages 1+2) was and is correct, because the mentioned error happens at the moment when gptzfsboot tries to assemble the zroot pool and start /boot/zfsloader from it. Anyway, I had also previously tried reinstalling both pmbr and gptzfsboot; the first attempt was with the original 13.0-p10 versions from the mounted pool, then I tried the latest versions from another 13.2-p5 system. No change in behavior, it still gets stuck at the same moment. I also tried manually entering the location at the boot prompt, like zfs:zroot/ROOT/default or the full form of that with the loader path appended after another colon, but no dice, it can't access the zroot pool.
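For clarity, the forms I typed at that boot prompt were roughly these (from memory, so treat the exact syntax as approximate):

Code:
zfs:zroot/ROOT/default:
zfs:zroot/ROOT/default:/boot/zfsloader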
The last thing I tried was changing the device paths in the pool from GUIDs (/dev/gptid/...) to plain BSD device nodes (ada0p3 and so on) via zpool export and an import with the respective -d parameter. That went fine, but unfortunately it also didn't help with booting.
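Roughly like this (from memory, device names as examples):

Code:
zpool export zroot
# import by pointing at the plain device nodes so the pool config records ada*p3 instead of gptids
zpool import -R /mnt -d /dev/ada0p3 -d /dev/ada1p3 -d /dev/ada2p3 -d /dev/ada3p3 zroot
zpool status zroot   # vdevs now listed as ada*p3
zpool export zroot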

Michal
 
Maybe I'll try asking a different question. Have you encountered similar issues when using larger (>2 TB) drives for the root pool on a system with gptzfsboot (e.g. when the particular computer doesn't support the EFI loader)? Or are such setups rather common and shouldn't have any inherent issues?
To my knowledge, this should not be an issue. It's been a while since I've tried a boot pool on 2TB or greater but I think you should be ok.
Going back to your OP, it sounds like everything installed correctly and has been running fine for a while, through reboots, is that correct?
 
The disk & zpool looks good.

Yes, they look fine to me too.

Maybe an update of the O.S. or zpool upgrade without upgrading bootcode ?

There wasn't any scheduled or applied system upgrade before this happened. I've also read posts from other users who had similar issues and tracked them down to an incomplete or otherwise problematic system upgrade - like you've mentioned, old bootcode from a previous system version, or some other upgrade-related issue, which they managed to fix with ZFS checkpoints, a forced pool import and a clean export.
But unfortunately it seems none of that applies to this server. Plus, I've already tried reinstalling the pmbr and the bootcode with two different versions.

To my knowledge, this should not be an issue. It's been a while since I've tried a boot pool on 2TB or greater but I think you should be ok.

Going back to your OP, it sounds like everything installed correctly and has been running fine for a while, through reboots, is that correct?

Yes, I installed the server, tested it for a while, and tried a couple of reboots to check that everything worked and all settings were persistent. Then I left it pretty much intact; it worked for about a year, until they couldn't access it over the network. Guided over the phone, they tried an ACPI-initiated shutdown, then a forced shutdown and cold start, but no dice. It was a headless system, so when I gained physical access, I found the system couldn't start from the root pool.
The puzzling part is that the hardware seems to be OK, the pool mounts fine without errors under the live USB, and the old kernel and system messages don't indicate any issues (just the usual stuff, like some info from Samba and messages about log rotation).

As far as I understand it, all the custom loader settings from loader.conf, and the pool properties where the boot environment parameters are stored, come later in the boot sequence, so the setup there shouldn't have any influence on this issue - it gets stuck earlier. According to its description, gptzfsboot is rather simple: it essentially works only with BIOS-mapped drives, checks partition signatures starting from the boot drive, and when it finds freebsd-zfs partitions it tries to assemble the respective pool before handing control to the full loader residing in /boot there. Tomorrow I'll likely try to further debug the BIOS drive mapping.
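If I read gptzfsboot(8) correctly, its prompt accepts a status command, so the plan is something like this (untested yet):

Code:
# at the boot: prompt that gptzfsboot drops into on failure
status    # should print the BIOS-visible drives and any discovered ZFS pools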
 
"old kernel" what version? Live USB is running 13.2 and can access everything.
Perhaps updating the boot code from the live USB would help.
Boot from the live USB, then something like (of course replace ada0 with the correct devices):

gpart bootcode -b /boot/pmbr ada0
gpart bootcode -p /boot/gptzfsboot -i 1 ada0
 
Thank you for the reply.

"old kernel" what version? Live USB is running 13.2 and can access everything.

My apologies, I wrote that badly and ambiguously. Sorry about that. "Old kernel" belongs to "old kernel messages" in my sentence. In other words, I looked for possible issues in the dmesg files under /var on zroot, which I had mounted in the live USB system. It's not that it booted from the pool with some old kernel.

Perhaps updating the boot code from the live USB would help.
Boot from the live USB, then something like (of course replace ada0 with the correct devices):

gpart bootcode -b /boot/pmbr ada0
gpart bootcode -p /boot/gptzfsboot -i 1 ada0

Yes, I know about that procedure and already tried it, as I wrote earlier. I reinstalled both pmbr and gptzfsboot, the first time with files from the mounted zroot pool (13.0-RELEASE-p10) and then also from the newer 13.2 release, and finally reverted to the older bootcode since it didn't help. I concluded it is very unlikely to be the culprit, because those first boot phases (0 and 1) are fine. The code seems to be loaded correctly from the first boot drive, but for some reason it can't "see" the GPT partitions necessary to assemble the zroot pool and access the loader in the /boot directory, as indicated by the ZFS: i/o error mentioned in the first post.
 
msmucr thanks. I realized that after posting and re-reading everything, but figured I'd ask just to be really, really sure.
I'm at my limit; I'm not aware of anything that should cause this. The fact that, booting from the live USB, you can import and export the pool on the same hardware with no issues implies there is no hardware problem.
The only really weird thing I can think of:
Maybe it's related to zpool.cache? The live USB would create its own /boot/zfs/zpool.cache when you import the existing pool, but booting from the device would use the existing /boot/zfs/zpool.cache on the pool, which would likely be read at the point in the boot cycle you are describing.

How to fix it? Not sure. Perhaps boot from the live USB, import the pool, then copy the cache to the physical devices?
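Something along these lines maybe (untested, off the top of my head; assumes the pool gets imported under /mnt):

Code:
zpool import -f -R /mnt zroot
# write a fresh cache file somewhere writable, then drop it onto the pool's own /boot
zpool set cachefile=/tmp/zpool.cache zroot
cp /tmp/zpool.cache /mnt/boot/zfs/zpool.cache
zpool set cachefile=none zroot
zpool export zroot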
 
TL;DR - it's a hardware issue.

First of all, thank you for the replies and hints. I'm sorry for the late update; I needed to find some time for further testing. I have some good news - the issue is specific to this particular server.

As I've mentioned, it is an older HP MicroServer Gen8 and it doesn't support EFI boot.
Its onboard SATA controller is a bit special - it has five SATA ports in total. The first two ports are SATA 3.0, the remaining three are SATA 2.0. Four ports are connected to the four 3.5" slots in the hard drive cage; the last one is meant for a slim internal optical drive (not installed here). The controller then has three operating modes.
- RAID mode (B110i)
This is a classic software (fake, HW-assisted) RAID, akin to Intel RSTe. Individual drives are visible to BSD or Linux, but not mapped in the BIOS (via int 0x13) unless you define a RAID-0 volume for each one.
- AHCI mode
The first four ports are mapped as BIOS drives (0x80 up to 0x83); all ports are visible in the OS via two SATA controllers (4 + 1). NCQ is fully supported.
- SATA legacy mode
All five ports are mapped as BIOS drives; all ports are visible in the OS via two SATA controllers (4 + 1). NCQ is not supported (more on that later).

When I installed FreeBSD 13.0 last year, the mode was AHCI. This worked well for the installation, system updates, a couple of reboots, all fine. However, the system then came back to me as unbootable, without any obvious cause, as I've described above in the thread. After all the attempts, I narrowed it down to the moment when gptzfsboot (the early-stage bootstrap loader) apparently can't "see" the ZFS root pool. This stage comes before the partition with /boot is accessible (so any further loader.conf settings or the pool cache are irrelevant); the drives are BIOS-mapped and accessed via interrupt 0x13 (no SATA or AHCI kernel driver is involved yet).
Then I recalled that the SATA controller has the three modes mentioned above, so I tried switching it from AHCI to Legacy mode. Now the bootstrap found the pool, continued to the full loader (with the ASCII Beastie screen) and booted correctly into the system - what a surprise!
Following that, I tried to find differences between the two modes. I don't have many tools for examining BIOS-mapped drives, but I remembered the recovery tool DMDE https://dmde.com, which has a DOS version. So I created a bootable USB stick with FreeDOS + the DMDE trial. When I tried to access the drives in both modes, everything seemed fine. In both modes it saw all four drives (plus the USB drive mapped as 0x80), read the GPT tables, and even accessed the start of the freebsd-zfs partitions; random seeking there wasn't a problem. In SATA legacy mode, DMDE couldn't access LBAs up to the full 8 TB capacity.
Then covacat mentioned lsdev from the USB boot... that was a very useful hint; I didn't know it also shows partitions and ZFS pools. And indeed it finally shows the difference - although it lists all the required partitions in both modes, in AHCI mode the ZFS pool is missing. That's the culprit behind why it couldn't boot.
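For anyone finding this later: boot the installer/live USB, pick "Escape to loader prompt" in the menu, and then:

Code:
OK lsdev -v    # lists BIOS disks, their partitions and the ZFS pools the loader can see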

Still, it is more than weird that AHCI mode stopped working for booting when it was fine before. However, I also found that the BIOS POST screen shows a warning about an HP iLO (out-of-band access) self-test error. The NAND flash used as its NVRAM has failed. Normally that shouldn't affect anything other than iLO functionality, so if one doesn't use it, it shouldn't matter, I thought. Wrongly, in this case: on this particular Gen8 server it also affects the functionality of the controller, because with failed NVRAM it, for example, can't access the setup tool for RAID mode. More about that here: https://op-co.de/blog/posts/microserver_gen8_fix_boot_order
So although it likely sounds super strange/silly ;) (I definitely hadn't experienced anything like that before), the only explanation I've come up with so far is that the NAND flash fault (the only change on the server) somehow also affected the BIOS-mapped drives in AHCI mode, and the bootstrap loader couldn't access the ZFS partitions in time and find the root pool. Without further debugging of the bootstrap, it's likely not possible to find out more.

I've kept it in legacy mode. I did upgrade to 13.2. It works well across reboots, with no errors or warnings. Performance decreased a bit, because legacy mode effectively disables NCQ, despite what the linked blog article says. Although camcontrol shows 32 tags for the devices, some analysis during I/O-intensive workloads (like a scrub) shows about a 20% drop compared to AHCI mode. I've also cross-checked that with a bootable Linux USB, where sysfs shows queue_depth = 1 for the devices.
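The checks I used were roughly these (device names just as examples):

Code:
# FreeBSD: advertised/used tag counts for the device
camcontrol tags ada0 -v
# Linux live USB: effective queue depth per disk
cat /sys/block/sda/device/queue_depth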
The other option is to keep AHCI mode but skip ZFS boot, place the whole /boot on a UFS-formatted stick and boot from USB; however, that has other disadvantages. Performance in legacy mode is still sufficient IMHO for the intended use case (it comfortably saturates gigabit Ethernet with sequential I/O).

Thanks again, this one had been stuck in my head for well over a week :)
 

Attachments

  • DMDE - AHCI.JPG (66.8 KB)
  • DMDE - SATA legacy.JPG (61 KB)
  • lsdev ahci.png (18.5 KB)
  • lsdev legacy.png (18 KB)
Wow, thanks, that's thorough and very thoughtful.

… switching it from AHCI to Legacy mode. Now the bootstrap found the pool, continued to the full loader (with the ASCII Beastie screen) and booted correctly into the system - what a surprise!

… Still, it is more than weird that AHCI mode stopped working for booting when it was fine before. …

For some reason (apologies if off-topic) I'm drawn to part of a 2017 comment in a 2010 report:

… If the Storage Configuration at BIOS is changed from AHCI to IDE, then the system boots. …
 
For some reason (apologies if off-topic) I'm drawn to part of a 2017 comment in a 2010 report:

Graham, thank you for the link. Not off-topic at all, the whole thread is a very interesting read.
Especially the last messages, where some people still have issues with various older computers, pretty much regardless of brand, even with the current bootstrap loader. Allan's patches, which work up to FreeBSD 12, helped them according to their reports.
For me it just shows how much historical ballast and how many potential incompatibilities the handling of legacy BIOS drives carries. If I recall correctly, there are at least three different LBA mapping schemes to accommodate progressively larger drive capacities. Plus, hardware vendors' implementations of int 0x13 calls can sometimes be pretty creative, to put it mildly. I remember the woes of the 90s during my high school days, when we all had computers with a single HDD and multiboot managers for different systems were a must. Some of them could self-install and detect partitions (for instance when booted from a floppy first). If there were address or capacity calculations involved and the BIOS reported bogus values under certain conditions, it could be quite a (deadly) fun ;)
Not that EFI is always perfect, and there can certainly be issues there too, however in some areas it is a relief and definitely a needed step forward.
It's a bummer that the friend's old HP server only has the BIOS boot option, otherwise I'd likely place a sole EFI boot partition on a USB drive for booting the ZFS pool (with an appropriate loader.env) and call it a day. That's definitely my preferred way, as the USB drive would be the only boot device in the system even if the first HDD failed (so I wouldn't need to rely on the EFI firmware's detection of drive order or manually modify EFI boot variables). Additionally, it avoids the need to manually synchronize EFI partitions after a system loader update. So maybe in his next small server :)
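(What I have in mind for such a setup is roughly the following - untested here, the device name and size are placeholders, and the loader.env part would need to be checked against loader.efi(8):)

Code:
# da0 = small USB stick that carries only the EFI system partition
gpart create -s gpt da0
gpart add -t efi -s 260m da0
newfs_msdos -F 32 /dev/da0p1
mount -t msdosfs /dev/da0p1 /media
mkdir -p /media/efi/boot
cp /boot/loader.efi /media/efi/boot/bootx64.efi
# plus an efi/freebsd/loader.env on this partition pointing the loader at zroot (see loader.efi(8))
umount /media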
 