UFS fsck: segmentation fault every time (UFS)

4 disks of 20 terabytes each in RAID 0 (gstripe, UFS). At varying intervals a gstripe error occurs during boot and the fsck disk check kicks in, which always ends in a segmentation fault. There is no other way to check the disk, and the segmentation fault occurs everywhere: in single-user mode, in multi-user mode, and from a live disk.

What am I doing wrong, and how do I fix it?
 
Try to boot a newer OS version (even -CURRENT) from external media and fsck it with the newer fsck.
If it still fails,
try to export the RAID over iSCSI and fsck it from a NetBSD/OpenBSD box (maybe their fsck is better)?
...
...
Switch to ZFS.
 
Try to boot a newer OS version (even -CURRENT) from external media and fsck it with the newer fsck: I tried it, but it didn't help.
Switch to ZFS: it seems that this is the only way out of this problem, but system performance would drop by 25-50%. ZFS is what I started with; because of the slowdowns I switched to UFS. As a result the slowdowns went away and the system flew, or rather it flew until the next gstripe crash and the inability to mark the disk (stripe) as clean because of the segmentation fault. The previous times I got around the problem by recreating the stripe and transferring the data back; this time there is nowhere to transfer the data and I will have to lose it...
 
OS version? Before anything else, I would check the memory (memtest or something similar) and run S.M.A.R.T. tests.
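
A minimal sketch of the S.M.A.R.T. part, assuming smartctl from the smartmontools package is installed and that the four stripe members are ada0 to ada3 (hypothetical names; `gstripe status` lists the real ones). By default it only prints the commands; set DRY_RUN=0 to execute them. A long self-test on a 20 TB disk runs for many hours, so reading the report is a separate step:

```shell
#!/bin/sh
# Hypothetical member disks of the stripe; replace with the real ones.
DISKS="ada0 ada1 ada2 ada3"
# 1 = only print the commands (default), 0 = execute them.
DRY_RUN="${DRY_RUN:-1}"

# Echo a command, and execute it unless DRY_RUN is 1.
run() {
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

# Start a long (extended) self-test on every member disk.
smart_start() {
    for d in $DISKS; do
        run smartctl -t long "/dev/$d"
    done
}

# Print the full S.M.A.R.T. report, including the self-test log.
smart_report() {
    for d in $DISKS; do
        run smartctl -a "/dev/$d"
    done
}

smart_start
```

Once the self-tests have finished, calling smart_report shows the self-test log for each disk, along with attributes such as reallocated and pending sector counts.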
 
Would be good to see a gdb backtrace of the segfault.
Would this be the right way to get a gdb backtrace of the segfault?
# gdb fsck -y /dev/stripe/gs0p1

PS: I apologize for going off topic, but something tells me that fsck may not be designed to check disks of such a large volume. (There was a case when an old IP camera was recording via NFS to a storage box: when a large disk was installed, the disk was initialized and formatted and recorded for about a minute, then the camera crashed with a disk initialization error. By deduction I split the disk into 2 smaller ones and the problem went away.) Maybe the situation here is similar. In difficult moments my intuition has saved me more than once by suggesting non-trivial solutions that turned out to work. Or maybe these are just coincidences...
 
I would say that if you have the need for speed, buy a faster disk. I never saw a linear speed advantage from striping multiple drives. I came away quite disappointed.
I tried gstripe while trying to saturate 40Gb Ethernet with NVMe on PCIe3.
/dev/stripe/gs0p1
What's your use case? Do you use stripes with NVMe, SSD or HDD?

Since the OP is using such large storage I would imagine things like block size and stripe size would be important.
 
No, fsck should and does work with large filesystems. Yours isn't particularly large.

The workflow:
Code:
$ gdb fsck
(gdb) r -y /dev/stripe/gs0p1
# wait for segfault
(gdb) bt
# backtrace will be printed

Reading symbols from fsck...
Reading symbols from /usr/lib/debug//sbin/fsck.debug...
(gdb) r -y /dev/stripe/gs0p1
Starting program: /sbin/fsck -y /dev/stripe/gs0p1
[Detaching after vfork from child process 53943]
** /dev/stripe/gs0p1
** Last Mounted on /storage
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
fsck: /dev/stripe/gs0p1: Segmentation fault
[Inferior 1 (process 53941) exited with code 01]
(gdb)


P.S.: fsck always runs on the next boot because the disk status is not clean, and so it goes round in a circle: no errors get corrected and the disk is always dirty. After the error the system drops into single-user mode, where the same thing happens. To boot the system you have to edit fstab and then mount the disk manually in the booted system.
 
I also wanted to mention graid if you want to stick with UFS.
The former RAID on Motherboard tool now does everything.
Motherboard fake RAID is also not an option: under load it falls apart (under Windows it withstands any load and does not fall apart). That's why I assembled the gstripe as software RAID.
 
This:

Four disks of gstripe and:


Data that sounds valuable.

This is the wrong way to use RAID0. You use it for things like a scratch disk, not for something valuable.

Using it on a four-disk array was suicide.
Maybe you are right about everything and I made a stupid mistake, but this setup was a lifelong dream for me... (Some dream of an expensive car, some of a huge villa, and I dreamed of building exactly this and having no problems, but it didn't work out.)
 
hdd - Seagate Exos X20
 

OK, gdb detached. Time for a new plan to get a backtrace, this time via a coredump.

When you run fsck, make sure you are in a directory that is writable, e.g. on a USB stick. It should now write a core dump on SIGSEGV. Then get the backtrace like this:
Code:
$ cd <somewriteabledir>
$ fsck -y /dev/stripe/gs0p1
# should report that it dumped core
$ gdb /sbin/fsck fsck.core
(gdb) bt
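
The "[Detaching after vfork from child process ...]" line in the gdb output above suggests that fsck spawned a separate checker process and that the segfault happened there, not in fsck itself. Here is a rough sketch of debugging that process directly. It assumes the checker is fsck_ffs (the UFS checker on FreeBSD) at /sbin/fsck_ffs and that its core file would be called fsck_ffs.core; both are assumptions, not something confirmed in this thread. By default the script only prints the gdb commands; set DRY_RUN=0 to execute them:

```shell
#!/bin/sh
# Device name taken from the thread.
DEV="/dev/stripe/gs0p1"
# Assumption: the crash happens in the UFS checker that fsck spawns.
CHECKER="/sbin/fsck_ffs"
# 1 = only print the commands (default), 0 = execute them.
DRY_RUN="${DRY_RUN:-1}"

# Echo a command, and execute it unless DRY_RUN is 1.
run() {
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

# Run the checker under gdb in batch mode: "run" starts it with the
# arguments given after --args, "bt" prints the backtrace once it stops.
live_backtrace() {
    run gdb -batch -ex run -ex bt --args "$CHECKER" -y "$DEV"
}

# Print the backtrace from a core file ($1) written by the checker.
core_backtrace() {
    run gdb -batch -ex bt "$CHECKER" "$1"
}

live_backtrace
```

With DRY_RUN=0, live_backtrace runs the checker under gdb and prints the backtrace once it stops on the segfault; `core_backtrace fsck_ffs.core` reads the same information from a core file instead.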
 