Solved Help with locally built stable kernel not finding zfs root

Hi, I have installed FreeBSD on an amd64 laptop, and the only issue with it is an apparently long-standing bug in support for eMMC storage (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211705#c3), which I would like to investigate. But I'm stuck even getting an unmodified but locally compiled GENERIC amd64 kernel to work.

I'm experienced with kernel development but less so with FreeBSD specifically. I'm not totally new though; I've successfully cross-compiled and booted some modified 14-CURRENT RISC-V kernels (working on driver support for a board) from a different computer.

I figured I should start work on this issue by making sure I could locally build and boot an amd64 kernel without any modifications, before actually trying any changes, but so far each kernel I have built locally fails to mount the ZFS root, with
Code:
Trying to mount root from zfs:zroot/ROOT/testkern []...
Mounting from zfs:zroot/ROOT/testkern failed with error 2: unknown file system
I get a mountroot prompt, but trying to specify the root again manually (or the default boot environment) has the same issue.

I happen to be using a zfs boot environment for testing, after creating it, mounting it, and using DESTDIR with installkernel to put the new kernel in place. It's not specific to this particular boot environment: I've also just done installkernel into the default and observed the same failure (then rolled back). I can see zfs.ko being loaded right at the start of the boot process (it's the first module loaded) so I know that's not getting lost.

Using a stock kernel works fine, no issues with zfs root (except the waking-from-sleep bug I want to eventually investigate).

The main thing I can think of is that I may not have the right version of the source tree, but I'm not sure what other versions to try. I neglected to install source when I first set up this system. bsdconfig just hangs when I try to use that to get the source, so I manually cloned the git repo. I'm running 13.2 (with the latest patches). I've tried doing this with the 13.2-STABLE branch and the releng/13.2 branch, but both yield the same results after clean builds. Can anyone point me in the right direction here, whether it's finding a more correct source branch for my system, or something I may be overlooking in the build process that might cause this problem? I mentioned earlier that I've been working with 14-CURRENT on another project, but that source is on a different machine, so there are no 13.2 vs 14 mix-ups here.
 
How exactly are you building the kernel? What is in /etc/make.conf and /etc/src.conf? Are you using the llvm16 port to build the kernel by chance (it's known to break the kmods build)?
 
The easiest way to get a correct boot environment is to use bsdinstall zfsboot.
The gory details can be learned in /usr/src/usr.sbin/bsdinstall/scripts/zfsboot.
 
covacat I can see the zfs module loading when trying to boot the new kernel (it is the first module loaded after the freebsd loader hands control to the kernel)

yuripv79 I'm using LLVM15 (during build I can see invocations specifically of clang15 and such), building/installing from
Code:
/usr/src
with:

make -j2 buildkernel
make -j2 DESTDIR=/mnt/<where bectl mount stuck the mount for the new bootenv> installkernel

As I mentioned before, I've also tried this just installing into the current working boot environment by omitting DESTDIR.

I don't have a make.conf or src.conf in /etc.
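
For readers following along, the workflow described in this thread (create a boot environment, mount it, build, then install the kernel into it with DESTDIR) might look roughly like the sketch below. It is not the poster's exact command sequence: the boot environment name `testkern` comes from the thread, while the `/mnt` mount point, `KERNCONF=GENERIC`, the `run` helper, and the `DRY_RUN` switch are assumptions added for illustration. With `DRY_RUN=1` (the default here) the script only prints what it would do.

```shell
#!/bin/sh
# Sketch: build a kernel and install it into a separate ZFS boot environment.
# The BE name is taken from the thread; MNT and DRY_RUN are illustrative.
BE=${BE:-testkern}
MNT=${MNT:-/mnt}
DRY_RUN=${DRY_RUN:-1}

# run: print the command, and execute it only when DRY_RUN is not 1.
run() {
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

build_into_be() {
    run bectl create "$BE" &&
    run bectl mount "$BE" "$MNT" &&
    run make -C /usr/src -j2 buildkernel KERNCONF=GENERIC &&
    run make -C /usr/src installkernel KERNCONF=GENERIC DESTDIR="$MNT" &&
    run bectl umount "$BE" &&
    # -t activates the boot environment for the next boot only
    run bectl activate -t "$BE"
}

build_into_be
```

Activating with `-t` is a cautious choice for a test kernel: if the new kernel fails to mount root, the following boot falls back to the previously active environment.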

allolatr I'm pretty sure the boot environment is not the issue, as even when I just install the classic way into the *current* boot environment I have the same issue
 
Go back to the first post and the error message:
Trying to mount root from zfs:zroot/ROOT/testkern []...
Mounting from zfs:zroot/ROOT/testkern failed with error 2: unknown file system
That implies to me the boot loader does not recognize that boot environment as something it can work with.
It's been a long time so I don't recall the commands but I think there are commands in the loader that you can use to list recognizable bootable things.
It also seems to be saying "mount root" which implies it was able to at least get the kernel from somewhere. On ZFS I think "/boot/kernel" is part of a ZFS boot environment, so if it can get the kernel it should have a bootable "root".
 
Did you run `zpool upgrade -a`?
Or set any feature(s) via `zpool set`?

If so, and if any of the feature(s) enabled/activated is/are incompatible with /boot/kernel/zfs.ko and/or /boot/loader* and/or /sbin/zfs in the (old) boot environment, it cannot handle the pool anymore.

This is an easily forgotten and fatal pitfall.
 
mer The log excerpt is from the *kernel* boot log, I can see the handoff from the bootloader to the kernel, and zfs is in the modules loaded into memory *by* the bootloader

T-Aoki This is a pretty fresh install, I've not mucked with ZFS aside from creating a new boot environment
 
Is the output of
zfs get mountpoint,canmount zroot/ROOT/testkern
zpool get bootfs zroot

consistent with the location of /boot/kernel and the root filesystem / ?
 
during build I can see invocations specifically of clang15 and such
Could you be more specific here? There is no clang15 *binary* in base system, which would mean it's being used from the ports' llvm and could be a problem. There are /usr/lib/clang/15.x paths though and those are normal.
 
Is the output of
zfs get mountpoint,canmount zroot/ROOT/testkern
zpool get bootfs zroot

consistent with the location of /boot/kernel and the root filesystem / ?

My output from those commands is

$ zfs get mountpoint,canmount zroot/ROOT/testkern
NAME PROPERTY VALUE SOURCE
zroot/ROOT/testkern mountpoint / local
zroot/ROOT/testkern canmount noauto local
$ zpool get bootfs zroot
NAME PROPERTY VALUE SOURCE
zroot bootfs zroot/ROOT/default local

I don't actually know enough about modern ZFS (used it on OpenSolaris many years ago....) to know if these settings are appropriate. But again, I'm still skeptical that this is the issue (or at least, the only issue I guess) because if I just make installkernel with no DESTDIR, so install into the current, default boot environment, I encounter the same error. This is why I suspect it's a build or build-config issue.
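
As a rough aid for judging that output: to my knowledge, boot environments created by bectl normally carry `mountpoint=/` and `canmount=noauto`, which matches what is shown above, and `bootfs` continuing to name `zroot/ROOT/default` is expected when another environment is picked from the loader menu rather than activated. A minimal sketch of that check follows; the function name `check_be` is hypothetical, and the "normal" values are an assumption based on typical bectl behaviour rather than anything stated in this thread.

```shell
#!/bin/sh
# check_be MOUNTPOINT CANMOUNT
# Prints "ok" when the values match what a bectl-created boot environment
# usually has (assumed here: mountpoint=/ and canmount=noauto); otherwise
# prints the unexpected values and returns 1.
check_be() {
    if [ "$1" = "/" ] && [ "$2" = "noauto" ]; then
        echo "ok"
    else
        echo "unexpected: mountpoint=$1 canmount=$2"
        return 1
    fi
}

# On a live system the two values could be read with, for example:
#   zfs get -H -o value mountpoint zroot/ROOT/testkern
#   zfs get -H -o value canmount   zroot/ROOT/testkern
# Here the values from the output quoted above are passed in directly.
check_be / noauto
```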
 
Could you be more specific here? There is no clang15 *binary* in base system, which would mean it's being used from the ports' llvm and could be a problem. There are /usr/lib/clang/15.x paths though and those are normal.
Hmm, apparently I do have a binary package for llvm15 installed; presumably pulled in as part of a dependency.

Ah! You're right, this is likely at least part of the problem. I had naively set the compiler package to llvm15, not realizing that I had picked up that particular habit building 14-CURRENT over the summer doing RISC-V work; the system compiler for 13.2 is llvm *14*:
Code:
$ clang --version
FreeBSD clang version 14.0.5 (https://github.com/llvm/llvm-project.git llvmorg-14.0.5-0-gc12386ae247c)
Target: x86_64-unknown-freebsd13.2
Thread model: posix
InstalledDir: /usr/bin
$ clang15 --version
clang version 15.0.7
Target: x86_64-portbld-freebsd13.2
Thread model: posix
InstalledDir: /usr/local/llvm15/bin

I'll go see if this is the issue...
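
The distinction drawn above (base compiler in /usr/bin versus a ports compiler under /usr/local) can be checked mechanically from the `InstalledDir` line of the `--version` output. The sketch below is only an illustration; `compiler_origin` is a hypothetical helper name, and the classification rule is simply the one implied by the two outputs quoted in this post.

```shell
#!/bin/sh
# compiler_origin: read `clang --version` style output on stdin and print
# "base" if InstalledDir is /usr/bin, "ports" if it is under /usr/local,
# and "unknown" otherwise.
compiler_origin() {
    dir=$(sed -n 's/^InstalledDir: //p')
    case "$dir" in
        /usr/bin) echo base ;;
        /usr/local/*) echo ports ;;
        *) echo unknown ;;
    esac
}

# Example using the clang15 output quoted in the thread; prints "ports".
printf '%s\n' 'clang version 15.0.7' \
    'Target: x86_64-portbld-freebsd13.2' \
    'Thread model: posix' \
    'InstalledDir: /usr/local/llvm15/bin' | compiler_origin
```

On a real system the same function could be fed with `clang --version | compiler_origin` or `clang15 --version | compiler_origin`.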
 
I know (got bitten by that myself) that at least the current llvm16 from ports miscompiles kmods, and you get that same "unknown file system" error because the resulting zfs.ko is broken.
 
This was the problem! Rebuilding with the correct compiler yields a working kernel. Somehow despite having checked out the 13.2-STABLE branch I've ended up with a 15-CURRENT kernel(???) but I'm sure I can figure that out. Thanks for the help everyone, especially yuripv79
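
On the leftover question of which version the checked-out tree really is: one way to look, sketched below, is to read the `REVISION` and `BRANCH` assignments from `sys/conf/newvers.sh` in the source tree and compare them with `uname -r`. The helper name `src_version` is hypothetical, and the sketch assumes the file contains plain `REVISION="..."` and `BRANCH="..."` lines, which is how I remember it but is worth confirming against the actual file.

```shell
#!/bin/sh
# src_version SRCDIR
# Prints "<REVISION>-<BRANCH>" as recorded in SRCDIR/sys/conf/newvers.sh,
# for example "13.2-STABLE". Assumes plain REVISION="..." / BRANCH="..."
# assignment lines; only the first match of each is used.
src_version() {
    f="$1/sys/conf/newvers.sh"
    [ -r "$f" ] || { echo "cannot read $f" >&2; return 1; }
    rev=$(sed -n 's/^REVISION="\(.*\)"$/\1/p' "$f" | head -n 1)
    br=$(sed -n 's/^BRANCH="\(.*\)"$/\1/p' "$f" | head -n 1)
    echo "${rev}-${br}"
}
```

Running `src_version /usr/src` next to `uname -r` should show quickly whether the tree matches the installed system. For reference, the upstream git branches are named `stable/13` and `releng/13.2`, with `main` carrying CURRENT; `git -C /usr/src branch --show-current` reports which one is checked out.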
 