Linus Torvalds Reviews The Bcachefs File-System Code

As far as I know, BtrFS remains the default file system on SUSE's enterprise edition. Honestly, I don't understand that. On one hand, I have heard from many people that BtrFS has lots of bugs, many of which cause data loss. On the other hand, I assume the good people at SUSE are not fools, and would not ship a system that eats data to their paying customers. I don't know how to resolve that contradiction.
Perhaps under certain configurations, maybe simple ones like a single device, it's fine and that's what SUSE defaults to? Maybe it only gets bad when you start trying non-default configurations?

Sun did not start from zero knowledge on filesystems :)
 
I think there is a very good argument "Windows does the same thing" to be made.
I largely agree, but I remember how NTFS would not corrupt itself and lose data when Windows inevitably crashed. I made it my mission in life to replace all FAT file systems at work with NTFS, just to save myself hundreds of hours a year. This is why I'm staggered by the choice of FAT for UEFI.
 
Well, I've seen Windows VMs nuking their filesystem *on top of ZFS* - i.e. the volume was 100% consistent, yet Windows managed to corrupt the NTFS filesystem and/or files within it...
Neither Windows nor NTFS give a crap about data integrity - both will happily pass along corrupted data (e.g. due to faulty RAM) until something blows up.
 
Perhaps under certain configurations, maybe simple ones like a single device, it's fine and that's what SUSE defaults to? Maybe it only gets bad when you start trying non-default configurations?

Sun did not start from zero knowledge on filesystems :)

Yup, this: most people agree that on a single disk, or on a few disks with RAID0/1, it works well enough, and it's used mostly for the snapshot functionality. You had better not touch anything else with a ten-foot pole.
 
Finally it happened: Bcachefs is now officially part of the Linux kernel, starting with version 6.7.

So alongside the disaster area that is Btrfs, which has never delivered what it promised since 2007, Linux finally has a new, shiny native COW file system whose whole purpose is to obsolete Btrfs and make things right. Well, we will see about that, I guess.

The Bcachefs FAQ certainly makes bold claims: Bcachefs is safer to use than btrfs and is also shown to outperform zfs in terms of speed and reliability

So I guess that remains to be determined as well.

 

How did it outperform ZFS in reliability?
 
It has 50 years of production uptime, at least. Pity it is the same 60 seconds over and over again.

Now seriously, only time can tell that. A lot of time. How long has ZFS been in production?

In the meantime, this claim tells us something about the person making it.
 
Bcachefs is an infant, it will take many years to prove itself. As of now, Linux does not have a reliable, modern file system.

It implies that there are other filesystems for Linux that do eat your data.
In fact, the Linux ext family eats files. All you need to do is press the reset button on the case of your machine. Ext4 is well known for this. And BTRFS is so buggy that it will probably never be fixed. For certain configurations, you are guaranteed to lose your data with it.

Neither Windows nor NTFS give a crap about data integrity - both will happily pass along corrupted data (e.g. due to faulty RAM) until something blows up.
Actually, it's true for Linux. As for Windows storage, your knowledge ends in 2011. ReFS can report files corrupted by bit rot, if you enable this feature. And on mirrored or parity Storage Spaces (except in the first release), it can repair them just like ZFS. There's even an integrity scrubber.
 
Wait, another example of "let's abandon this hot mess and create a new one" in the Linux world? Just like PulseAudio to Pipewire. That principle is an extension of the Linux world mantra "this is way too stable and old school, let's replace it with something new for shit and giggles". Like replacing working init systems with systemd or X11 with Wayland.
 
Wait, another example of "let's abandon this hot mess and create a new one" in the Linux world? Just like PulseAudio to Pipewire
I think Bcachefs reflects their desire for a reliable alternative to ZFS, not some flashy mess like Pipewire or PulseAudio, which Linux musicians have complained about. The purpose behind them is different.

If they wanted to include ZFS for seamless use alongside the Linux kernel, all they would have to do is include a specific API exception or other linking exception exclusively for ZFS. GPL allows linking exceptions (at least for libraries) and exceptions at the API, so long as it's allowed and declared by the authors, in this case Linus Torvalds.

Bcachefs is unproven, except for performance and that it's more reliable than btrfs.

They got away from btrfs, so they did something right.
That principle is an extension of the Linux world mantra "this is way too stable and old school, let's replace it with something new for shit and giggles". Like replacing working init systems with systemd
For most Linux projects.
or X11 with Wayland.
Not Wayland, because Wayland has fundamental design improvements, and it allows the compositor to be anything the user or any organization wants it to be. That includes putting a FreeBSD implementation on top of Wayland. If anything on it starts getting clunky, then a project can be started without that. In Wayland, it's about the compositor, the toolkit and parts already from Xorg like existing graphics drivers. The compositor can be a BSD design or a Linux design.

Wayland wasn't Linux's, though. I dislike how Linux tries to take over so much, including Xorg, which they now maintain but did not found. The only good thing is that they have kept the license as is so far. If they changed the license, I'd see that as good, because it would force BSD projects to maintain their own version based on the permissively licensed one.
 
Bcachefs is an infant, it will take many years to prove itself. As of now, Linux does not have a reliable, modern file system.
The question is what you mean by modern, but if it does not have to be COW then XFS fits that bill quite nicely. And if you want COW, you have been able to compile ZFS as an external kernel module for ages. You can run ZFS under Linux, it's just not part of the kernel tree for obvious reasons: the CDDL vs. GPLv2 incompatibility. So ZFS is a second-class citizen under Linux at best. Still, the ZFS implementation used in FreeBSD nowadays is the same as on Linux: OpenZFS.

In fact, the Linux ext family eats files. All you need to do is press the reset button on the case of your machine. Ext4 is well known for this. And BTRFS is so buggy that it will probably never be fixed. For certain configurations, you are guaranteed to lose your data with it.
You are getting something fundamentally wrong here: every modern file system with an enabled write cache - which is the default for most of them nowadays - will eat your data when you press the reset button - which is equivalent to a sudden power failure - before the cache has been written to disk, because the contents of the cache in RAM will simply be gone for good. It doesn't matter if this is UFS, ext, APFS, NTFS or whatever - that data will be gone!

Journaling in file systems typically only guarantees the integrity of the metadata, nothing more.

If you want a file system which does survive pressing the reset button, then you've got to disable the write cache, which will heavily degrade your system's performance, because there is less room for write optimisation. And even then, it will not save you from everything; it will just reduce the damage, because there is still a data loss window, it's just smaller than before.

This is the reason why every important server has a UPS attached to it, so that in case of a power failure the system can still write the cache to storage and shut down gracefully. Only with a UPS can you be reasonably safe from data loss in case of a power failure.
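
For individual files there is a middle ground between a global write cache and none at all: an application can ask the kernel to flush one file with fsync before it treats the write as done. A minimal Python sketch, assuming a POSIX system; the helper name `durable_write` and the `.tmp` suffix are made up for illustration, and whether the data really reaches stable media still depends on the drive honouring the flush:

```python
import os


def durable_write(path, data):
    """Write data to path so that a reset leaves either the old or the new file.

    Hypothetical helper for illustration; assumes POSIX semantics.
    """
    tmp = path + ".tmp"  # assumed temporary name, same directory as the target
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()              # push Python's buffer into the kernel
        os.fsync(f.fileno())   # ask the kernel to push its cache to the device
    os.replace(tmp, path)      # atomic rename over the old file
    # Flush the directory entry too, so the rename itself survives a reset.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Anything written without such a flush can still sit in RAM when the reset button is pressed, which is the data loss window described above.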

Regarding Bcachefs: since Btrfs is just a pile of stinking garbage which will never be ready for anything and should just be deleted, I do appreciate that somebody took the effort and time to create an alternative to it which is now part of the mainline kernel. What it is worth, we will see in the future; there is room for tons of optimisations and it will also have to prove its claims. But it is still nice to have one more option to choose from.
 
btrfs also had the caveat of not really being so much a passion project as it was a corporate cost cutting project by oracle et al to make something akin to Sun's zfs. Oracle just took the easy way out and purchased Sun.
 
The question is what you mean by modern, but if it does not have to be COW then XFS fits that bill quite nicely. And if you want COW, you have been able to compile ZFS as an external kernel module for ages. You can run ZFS under Linux, it's just not part of the kernel tree for obvious reasons: the CDDL vs. GPLv2 incompatibility. So ZFS is a second-class citizen under Linux at best. Still, the ZFS implementation used in FreeBSD nowadays is the same as on Linux: OpenZFS.
To make this comparison fair, I think we should stick to file systems that are native or ported to the OS in question (ported means ZFS on FreeBSD or XFS on Linux). I can use ZFS on Windows, too. It's even easier than on Linux: no compilation is necessary.

You are getting something fundamentally wrong here: every modern file system with an enabled write cache - which is the default for most of them nowadays - will eat your data when you press the reset button - which is equivalent to a sudden power failure - before the cache has been written to disk, because the contents of the cache in RAM will simply be gone for good. It doesn't matter if this is UFS, ext, APFS, NTFS or whatever - that data will be gone!

Journaling in file systems typically only guarantees the integrity of the metadata, nothing more.

If you want a file system which does survive pressing the reset button, then you've got to disable the write cache, which will heavily degrade your system's performance, because there is less room for write optimisation. And even then, it will not save you from everything; it will just reduce the damage, because there is still a data loss window, it's just smaller than before.
Theoretically you're right. But practice shows that the behaviour of different file systems varies on a hard reset.

On Windows with NTFS, I have never seen corruption. Only files that were being written during the crash/reset are missing, truncated, or filled with zeros/garbage.
Of course, I have always used write-back, ever since this feature was introduced in 2003. The only difference is the number of missing new files. You can copy multiple (small) files to NTFS, and when the app shows it has finished, press the reset button and they'll all be gone. The rest of the file system should always be consistent, even with write-back enabled.
And with Storage Spaces it can only be better, because Storage Spaces has its own journal.

On a crash with journaled UFS on Solaris, I seldom saw minor inconsistencies, though fewer than I sometimes saw on Windows with FAT.

On the other hand, ZFS on Solaris is 100% bulletproof. I did torture tests on one machine: hundreds of hard resets during intensive write operations. Result: not a single inconsistency; only the boot time is longer.

And what often happens on a crash in Linux? It's simply sad.
 
No, it's not. I make checksums of important data files and often keep old data in archives (e.g. RAR), so nothing missing or changed will go unnoticed.
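
Roughly what such a checksum routine could look like, as a minimal Python sketch. The function names and the choice of SHA-256 are assumptions for illustration, not the tooling actually used here:

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not have to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare(old, new):
    """Return (missing, changed) file lists between two manifests."""
    missing = sorted(set(old) - set(new))
    changed = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    return missing, changed
```

Build a manifest while the data is known to be good, store it somewhere safe, and compare it against a fresh one after a crash: truncated or zero-filled files show up as changed, vanished ones as missing.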
 