UFS: fsck always fails with a segmentation fault

sharing · Apr 3, 2025

I have four 20 TB disks in RAID 0 (gstripe) with UFS. At varying intervals a gstripe error occurs during boot. With the fsck disk check enabled, fsck always ends in a segmentation fault, so there is no way to check the disk: the segfault occurs every time (in single-user mode, in multi-user mode, and from a live disk).

What am I doing wrong, and how do I fix it?

covacat · Apr 3, 2025

try to boot a newer os version (even -CURRENT) from external media and fsck it with the newer fsck
if it still fails
try to export the raid to iscsi and fsck from a netbsd/openbsd box (maybe their fsck is better)?
...
...
switch to zfs

sharing · Apr 3, 2025

covacat said:
try to boot a newer os version (even -CURRENT) from external media and fsck it with the newer fsck
if it still fails
try to export the raid to iscsi and fsck from a netbsd/openbsd box (maybe their fsck is better)?
...
...
switch to zfs

"try to boot a newer os version (even -CURRENT) from external media and fsck it with the newer fsck": I tried that, but it didn't help.
"switch to zfs": it seems this is the only way out, but system performance will drop by 25-50%. ZFS is what I started with; because of the slowdowns I switched to UFS. The slowdowns went away and the system flew, or rather it flew until the next gstripe crash, after which the stripe could not be marked clean because of the segmentation fault. Previously I got around the problem by recreating the stripe and copying the data over. This time there is nowhere to move the data, and I will have to lose it...

3301 · Apr 3, 2025

OS version? And before anything else, I would check the memory (memtest or something similar) and run S.M.A.R.T. tests.
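
As a rough illustration of the S.M.A.R.T. part of that advice, the dry-run sketch below only prints the commands one might run per stripe member. It assumes `smartctl` from the smartmontools package is installed, and the disk names `ada0` to `ada3` are placeholders, not the OP's actual devices.

```shell
#!/bin/sh
# Dry-run sketch: print the smartctl commands for each stripe member.
# The default disk names (ada0..ada3) are assumptions; adjust to the real devices.
smart_plan() {
    for disk in "$@"; do
        echo "smartctl -t long /dev/$disk   # start an extended self-test"
        echo "smartctl -a /dev/$disk        # later: read attributes and the self-test log"
    done
}

# Word splitting of DISKS is intentional here.
smart_plan ${DISKS:-ada0 ada1 ada2 ada3}
```

Nothing is executed against the disks; the printed lines can be reviewed and run by hand. An extended self-test on a drive of this size can take many hours.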

sharing · Apr 3, 2025

3301 said:
OS version?

14.2-RELEASE

sharing · Apr 3, 2025

3301 said:
check memory (memtest or something similar) and run S.M.A.R.T tests.

I checked everything: the memory is fine with no failures, and the SMART status is excellent.

cracauer@ · Apr 3, 2025

Would be good to see a gdb backtrace of the segfault.

sharing · Apr 3, 2025

cracauer@ said:
Would be good to see a gdb backtrace of the segfault.

Would it be correct to do this to get a gdb backtrace of the segfault?
# gdb fsck -y /dev/stripe/gs0p1

PS: I apologize for going off topic, but something tells me that fsck may not be designed to check disks this large. I once had an old IP camera recording via NFS to a storage box: when a large disk was installed, it was initialized and formatted and recorded for about a minute, then the camera crashed with a disk initialization error. By deduction I split the disk into two smaller ones, and the problem went away. Maybe the situation here is similar. In difficult moments my intuition has saved me more than once by suggesting non-trivial solutions that turned out to work. Or maybe these are just coincidences...

cracauer@ · Apr 3, 2025

No, fsck should and does work with large filesystems. Yours isn't particularly large.

The workflow:

Code:

$ gdb fsck
(gdb) r -y /dev/stripe/gs0p1
# wait for segfault
(gdb) bt
# backtrace will be printed
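
The same workflow can also be run non-interactively, which helps when the output has to be saved from single-user mode. This is only a sketch: it uses gdb's standard `-batch`, `-ex` and `--args` options, the helper name `gdb_batch_cmd` and the log file name are made up for illustration, and it prints the command line rather than running it.

```shell
#!/bin/sh
# Sketch: build a gdb batch command that runs fsck and prints a backtrace
# when the program stops on a signal.
# DEVICE defaults to the OP's device; LOG is a hypothetical file name.
DEVICE="${DEVICE:-/dev/stripe/gs0p1}"
LOG="${LOG:-fsck-bt.txt}"

# Print the gdb command line instead of running it, so it can be reviewed first.
gdb_batch_cmd() {
    printf 'gdb -batch -ex run -ex bt --args fsck -y %s\n' "$1"
}

gdb_batch_cmd "$DEVICE"
# To actually run it and keep the output (needs a writable directory):
#   sh -c "$(gdb_batch_cmd "$DEVICE")" 2>&1 | tee "$LOG"
```

If the program exits without a signal, `bt` has no stack to show.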

Phishfry · Apr 3, 2025

sharing said:
what am I doing wrong

This:

sharing said:
4 disks of 20 terabytes in raid 0 - gstripe ufs.

Four disks of gstripe and:

sharing said:
, this time there is nowhere to transfer the data and I will have to lose it.........

Data that sounds valuable.

This is the wrong way to use RAID0. You use it on things like scratch disks, not on something valuable.

Using it on a four disk array was suicide.

Phishfry · Apr 3, 2025

I would say that if you have the need for speed, buy a faster disk. I never saw linear speed advantage to multiple drive stripes. I left quite disappointed.
I tried gstripe while trying to soak 40Gb Ethernet with NVMe on PCIe3.

cracauer@ said:
/dev/stripe/gs0p1

What's your use case? Do you use stripes with NVMe, SSD, or HDD?

Since the OP is using such large storage I would imagine things like block size and stripe size would be important.

Phishfry · Apr 3, 2025

I also wanted to mention graid if you want to stick with UFS.

graid(8) manual page: man.freebsd.org

The former RAID on Motherboard tool now does everything.

sharing · Apr 4, 2025

cracauer@ said:
No, fsck should and does work with large filesystems. Yours isn't particularly large.

The workflow:

Code:

$ gdb fsck
(gdb) r -y /dev/stripe/gs0p1
# wait for segfault
(gdb) bt
# backtrace will be printed

Reading symbols from fsck...
Reading symbols from /usr/lib/debug//sbin/fsck.debug...
(gdb) r -y /dev/stripe/gs0p1
Starting program: /sbin/fsck -y /dev/stripe/gs0p1
[Detaching after vfork from child process 53943]
** /dev/stripe/gs0p1
** Last Mounted on /storage
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
fsck: /dev/stripe/gs0p1: Segmentation fault
[Inferior 1 (process 53941) exited with code 01]
(gdb)

P.S.: fsck always runs on the next boot because the disk is not marked clean, and so it goes around in a loop: no errors are corrected and the disk is always dirty. After the error the system drops into single-user mode, where the same thing happens. To boot the system I have to edit fstab and then mount the disk manually once the system is up.

sharing · Apr 4, 2025

Phishfry said:
I also wanted to mention graid if you want to stick with UFS.

graid(8) manual page: man.freebsd.org

The former RAID on Motherboard tool now does everything.

Motherboard fake RAID is not an option either: under load it falls apart (under Windows it withstands any load and does not fall apart). That is why I built the gstripe as software RAID.

sharing · Apr 4, 2025

Phishfry said:
This:

Four disks of gstripe and:

Data that sounds valuable.

This is the wrong way to use RAID0. You use it on things like scratch disks, not on something valuable.

Using it on a four disk array was suicide.

Maybe you are right about everything and I made a stupid mistake, but this setup was a lifelong dream for me... (some people dream of an expensive car, some of a huge villa; I dreamed of building exactly this and never having problems, but it didn't work out)

sharing · Apr 4, 2025

Phishfry said:
I would say that if you have the need for speed, buy a faster disk. I never saw linear speed advantage to multiple drive stripes. I left quite disappointed.
I tried gstripe while trying to soak 40Gb Ethernet with NVMe on PCIe3.

What's your use case? Do you use stripes with NVMe, SSD, or HDD?

Since the OP is using such large storage I would imagine things like block size and stripe size would be important.

HDD: Seagate Exos X20

cracauer@ · Apr 4, 2025

sharing said:
Reading symbols from fsck...
Reading symbols from /usr/lib/debug//sbin/fsck.debug...
(gdb) r -y /dev/stripe/gs0p1
Starting program: /sbin/fsck -y /dev/stripe/gs0p1
[Detaching after vfork from child process 53943]
** /dev/stripe/gs0p1
** Last Mounted on /storage
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
fsck: /dev/stripe/gs0p1: Segmentation fault
[Inferior 1 (process 53941) exited with code 01]
(gdb)

P.S.: fsck always runs on the next boot because the disk is not marked clean, and so it goes around in a loop: no errors are corrected and the disk is always dirty. After the error the system drops into single-user mode, where the same thing happens. To boot the system I have to edit fstab and then mount the disk manually once the system is up.

OK, gdb detached. Time for a new plan to get a backtrace, this time via a coredump.

When you run fsck, make sure you are in a writable directory, e.g. on a USB stick. It should now write a coredump on SIGSEGV. Then get the backtrace like this:

Code:

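$ cd <somewriteabledir>
$ fsck -y /dev/stripe/gs0p1
# should report that it dumped core
$ gdb /sbin/fsck fsck.core
# if the core file is named after a helper such as fsck_ffs, pass that binary and core file instead
(gdb) bt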
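
Another option, offered only as a sketch, is to keep gdb attached to the child that the "Detaching after vfork" message refers to. The command file below uses gdb's `set follow-fork-mode child` setting. The file name `fsck-child.gdb` is hypothetical, and whether the installed gdb follows the child cleanly on this system is an assumption.

```shell
#!/bin/sh
# Sketch: write a gdb command file that follows the vfork'ed child of fsck
# and prints a backtrace when that child stops on a signal.
# fsck-child.gdb is a hypothetical file name.
GDBFILE="${GDBFILE:-fsck-child.gdb}"

cat > "$GDBFILE" <<'EOF'
set follow-fork-mode child
set pagination off
run
bt
EOF

echo "wrote $GDBFILE; run it with:"
echo "  gdb -batch -x $GDBFILE --args fsck -y /dev/stripe/gs0p1"
```

For the coredump route, `sysctl kern.coredump kern.corefile` and `ulimit -c` show whether core files are enabled and how they are named.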

peter-h · Monday at 4:23 PM

fsck has memory needs. The "segmentation fault" makes me believe that you need more memory in order to fsck these disks.

cracauer@ · Monday at 4:39 PM

A lack of RAM (physical memory) generally doesn't lead to a SEGV (a virtual memory concept).

I'm not sure the OP is still around.

USerID · Tuesday at 8:55 AM

Phishfry said:
I never saw linear speed advantage to multiple drive stripes.

On my modest PC with ZFS, a stripe of two SSD drives (32GBx2) is very helpful. The performance gain is noticeable even without tests. But I keep the data on separate storage. Yes, you can't rely on a stripe (important and valuable data can be lost in a crash), but the gain from the ZFS stripe is significant.

Phishfry · Tuesday at 9:40 PM

sharing said:
I dreamed of building exactly this and never having problems, but it didn't work out

There is certainly nothing wrong with this. I used Dell PCIe adapters for dual M.2 and striped those, hoping for 8 GB/s to soak lagged 40Gb Ethernet.
Then PCIe4 drives came out, and a Samsung AIC does 7 GB/s.

The only bad part of your experiment was using data that was not backed up. Personally I didn't lose anything, but I have learned my lessons. Many times.