Solved Firefox crashes entire system / kernel panic - how can that happen ?

I hope it is all right to post this issue here; I would like to understand how the BSD kernel works.

I have installed FreeBSD 12.1 for non-professional use.

My question is how starting Firefox as an unprivileged user can bring down the whole system with "kernel panic" messages (see the attached picture for these messages), and what I can do to prevent this from happening again.

Since the first time this happened, several days after an installation that included the firefox package, the issue has reproduced every single time: starting firefox under a newly added user also caused the same kind of crash. X freezes as soon as firefox shows the requested page, and a few seconds later the system reboots. By switching to a root console before that, I could take the enclosed picture of the error messages.
Code:
panic: ffs_valloc: dup alloc
(skipped, kdb backtrace: )
#0 (address) at kdb_backtrace+0x67
             at vpanic+0x19d
             at panic+0x43
             at ffs_valloc+0x8d3
             at ufs_makeinode+0xa7
             at ufs_create+0x34
             at VOP_CREATE_APV+0x76
             at vn_open_cred+0x2bc
             at kern_openat+0x213
             at amd64_syscall+0x354
Since the last startup before the crashes, ports mongoose and xapian had been added, but both programmes had already been working correctly for some time.
12.1-RELEASE FreeBSD 12.1-RELEASE r354233 GENERIC amd64
firefox from iso-image

In the end, I wiped the disk and reinstalled bsd. Up till now, firefox has not done any harm.
 

Attachments

  • FFcrashesKernel.jpeg
The backtrace shows UFS code paths, so there is (or was) something wrong with your filesystem and/or there is a bug in the UFS code. It's not really firefox, but rather some filesystem access, that caused the panic.
 
Just to expand more on what xtouqh mentioned: the panic string is useful here (ffs_valloc: dup alloc). Try searching the FreeBSD bugzilla for the given string.
But as you don't have the system any more maybe it's irrelevant now.
 
There are two comparable, very old, reports on bugzilla. It may well be a hardware issue, because this is a system on an old laptop.

I do not understand, however, how an unprivileged process can make the kernel reboot. I thought that whatever a user did, could just harm the part of the system it has rights to. Unless there is a bug somewhere, of course.
This could have been successful malware instead of problematic hardware.
 
It's not an issue in userspace that caused the panic; the problem occurred in the kernel.

An unprivileged process communicates with the kernel (e.g. through syscalls). Should there be a bug in the kernel, a user may be able to trigger it and gain privileges and/or cause a panic, depending on the bug.
 
I also see 'inum = 81665, fs= /' at the top of the screen.
This is file
/boot/kernel/if_urtw.ko
and it seems to be a USB WLAN driver. WLAN is disabled in the BIOS and has not been configured in the system. I'll try to run a script that polls kldstat to record when this module gets loaded, if it ever is again.
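For reference, mapping an inode number back to a path can be done with find(1). This is only a minimal sketch: the function name `find_by_inum` is made up for illustration, and the mount point and inode number in the usage comment are the values reported in the panic message above.

```sh
#!/bin/sh
# Print every path under a mount point that has the given inode number.
# -xdev keeps find on that one filesystem, because inode numbers are
# only unique within a single filesystem.
find_by_inum() {
    mountpoint=$1
    inum=$2
    find "$mountpoint" -xdev -inum "$inum" -print 2>/dev/null
}

# Example with the values from the panic message (run as root so that
# all directories are readable):
#   find_by_inum / 81665
```

An empty result means that no directory entry currently points at that inode, which does not by itself say whether the inode is allocated on disk.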
 
I realized that afterwards...
I first thought that, right after installing the same base on the same partition with a newly made filesystem, the same files should have the same inode numbers: it is all deterministic.
But afterwards the files may get shuffled of course. ( late thought: the fact that if_urtw does not make much sense on a wlan-less system, should have made me realize that the odds were, that this was not the same file at the same inode :rolleyes: )
 
Update.

The problem has come back twice.
This time I checked the system before replacing it. I noticed that firefox crashed under different users, but always with the same inum on the / partition. There is no file on that partition with that inum. However fsck shows:
UNREF FILE I=82317 OWNER=me MODE=100400
ALLOCATED FILE 82317 MARKED FREE
BLK(S) MISSING IN BIT MAPS

After firefox started crashing the system again, I tested a loop which created 100 files in /tmp, and this also took the system down, with the same inum and backtrace.
So the workaround was to avoid temporary files on the root partition: I linked /tmp to /var/tmp.
I'll probably need to fsck the root partition from a rescue stick to free the inum again.
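A loop of the kind described above could look roughly like the following. It is a sketch rather than the exact loop that was run: the function name `create_test_files` and the file name pattern are hypothetical. Each new file needs a fresh inode, so on a filesystem in the state described here, running it against the root partition may well trigger the same panic.

```sh
#!/bin/sh
# Create COUNT empty files in DIR (default 100). Every file creation
# asks the filesystem to allocate a new inode.
create_test_files() {
    dir=$1
    count=${2:-100}
    i=1
    while [ "$i" -le "$count" ]; do
        : > "$dir/inode-test.$i" || return 1
        i=$((i + 1))
    done
}

# Example (do not run this on a filesystem you cannot afford to crash):
#   create_test_files /tmp 100
```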
 
I don't think fsck will help you here. You said you reinstalled the system before and yet you still have the issue.
Is the crash always the same? That is, is the panic string the same as in your first post from October?

Can you share your setup, such as the actual version running (output of uname -a) and the system layout (output of df -m)?
Does dmesg say anything that would indicate any other problem?
 
I suggest you run smartctl(8) on the drive. It's likely it has bad sectors and you keep hitting those. Time to replace the disk if that's the case.
 
The panic string is always the same (ffs_valloc: dup alloc).
I reinstalled the system from the same iso image as the last time,
FreeBSD youme.org 12.1-RELEASE FreeBSD 12.1-RELEASE r354233 GENERIC amd64
The crash occurs on the root partition,
/dev/ada0a 2 0 2 17% /
with remaining partitions
ada0d /usr, ada0e /var, ada0f /usr/local, ada0g /home

There are some errors in the message logs, but they do not seem to be related to the fs or hd.
After the crashes, of course, there is a
WARNING: / was not properly dismounted
during a few reboots, there is another
/: mount pending error: blocks ....
There are messages
pcib2: failed to allocate initial I/O port window: 0-0xfff
pcib0: could not evaluate _ADR - AE_NOT_FOUND
which are not related, I think.

The backtraces are almost identical, sometimes containing kern_openat, sometimes kern_mkdirat, depending on what I tried.

I have not checked all the smartctl options, but '-t short' outputs:
# 1 Short offline Completed without error 00% 29535

My only clue, well, what I think is a clue, is that the fsck output contains several lines
UNREF FILE I=82??? OWNER=you MODE=100400
but just one single line
ALLOCATED FILE 82317 MARKED FREE
BLK(S) MISSING IN BIT MAPS
which is the crash inum, but I have not had the time to look into that yet.
 
Do a long test too. Also check the values for Current_Pending_Sector and Offline_Uncorrectable using smartctl -a <drive>
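To pick just those two attributes out of the full report, a small awk filter is enough. This is a sketch under two assumptions: the function name `smart_sector_attrs` is made up, and the raw value is taken to be the last column of the attribute line, which is the usual layout of the smartctl attribute table. The device name in the usage comment is also an assumption; use the whole-disk device for your drive.

```sh
#!/bin/sh
# Read `smartctl -a` output on stdin and print "ATTRIBUTE_NAME RAW_VALUE"
# for the two attributes that hint at unreadable sectors.
# Assumption: RAW_VALUE is the last whitespace-separated field.
smart_sector_attrs() {
    awk '$2 == "Current_Pending_Sector" || $2 == "Offline_Uncorrectable" { print $2, $NF }'
}

# Example (assumed device name):
#   smartctl -a /dev/ada0 | smart_sector_attrs
```

A non-zero raw value for either attribute means the drive has sectors it could not read.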
 
It probably has a bad sector. The extended check does not find anything:
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 29536 -

But the -a option shows

Code:
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       1

It is an unreadable sector, I guess.
 
It's only one, but yeah. It's possible you kept hitting that same bad spot. In any case, I would replace it, it's not exactly trustworthy any more.
 