As I predicted, out-of-tree bcachefs is basically dead on arrival: everybody interested is already on ZFS, and btrfs is still around mostly because ZFS can't be mainlined.
ZFS is extremely annoying with the way it handles expansion and the fact that you can't mix drive sizes. It's not a panacea. There is clearly space for an improved design.
[1]: https://hexos.com/blog/introducing-zfs-anyraid-sponsored-by-...
I then had to manually delete the file with the I/O error (for which I had to resolve the inode number it barfed into dmesg) and try again - until the next I/O error.
(I'm still not sure whether the disk was really failing. I did a full wipe afterwards and a full read to /dev/null and experienced no errors - it might have just been the metadata that was messed up.)
Over the last year or two, I've twice experienced a checksum mismatch on the file storing the memory of a VMware Workstation virtual machine.
Both are very likely bugs in Btrfs, and it's very unlikely that they were caused by the user (me).
In the relatively far past (around 5 years ago), I had the system (with root on Btrfs) become unbootable for no obvious reason, a couple of times.
Despite not being directly used, these blocks are kept (and cannot be reused) because another part of the extent they belong to is actually used by files.
This can happen if a large file is written in one go, and then later one block is overwritten - btrfs may keep the old extent, which still contains the old copy of the overwritten block.

It's generally fine if you stay on the happy path. It will work for 99% of people. But if you fall off that happy path, bad things might happen and nobody is surprised. In my personal experience, nobody associated with the project seems to trust a btrfs filesystem that fell off the happy path, and they strongly recommend you delete it and start from scratch. I was horrified to discover that they don't trust fsck to actually fix a btrfs filesystem into a canonical state.
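If you want to see the extent-pinning effect described above for yourself, here is a minimal sketch that sets up the scenario: write a big file in one go, then overwrite a single block in place, then compare what `du` reports against what `compsize` (or `btrfs filesystem du`) says is actually referenced on disk. The path and sizes are illustrative assumptions, and this assumes a CoW (non-nodatacow) btrfs mount.

```python
# Sketch: reproduce the "old extent stays pinned" scenario on a btrfs mount.
# PATH, FILE_SIZE and BLOCK are arbitrary; adjust to your setup.
import os

PATH = "/mnt/btrfs/testfile"        # assumed path on a btrfs filesystem
FILE_SIZE = 256 * 1024 * 1024       # 256 MiB written in one go -> large extents
BLOCK = 4096                        # one filesystem block

# 1. Write the file in a single pass so btrfs allocates large extents.
with open(PATH, "wb") as f:
    f.write(b"\1" * FILE_SIZE)
    f.flush()
    os.fsync(f.fileno())

# 2. Overwrite one block in the middle. CoW puts the new block in a new
#    extent, but the original large extent stays referenced by the rest
#    of the file, so the old copy of that block is still taking up space.
fd = os.open(PATH, os.O_WRONLY)
os.pwrite(fd, b"\xff" * BLOCK, FILE_SIZE // 2)
os.fsync(fd)
os.close(fd)

# Now compare `du -h <file>` with `compsize <file>` or `btrfs filesystem du`.
```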
BCacheFS had the massive advantage that it knew it was experimental and embraced it. It took measures to keep data integrity despite the chaos, generally seems to be a better design and has a more trustworthy fsck.
It's not that I'd trust BCacheFS, it's still not quite there (even ignoring project management issues). But my trust for Btrfs is just so much lower.
bcachefs isn't going away.
The SuSE guy also reversed himself after I asked; Debian too, so we have time to get the DKMS packages out.
I don't plan on giving ZFS or other filesystems not designed for Linux another go.
Heck, no native filesystem besides btrfs has compression; I'm saving HUNDREDS of GB with zstd compression on my machines, with basically zero overhead.
I approached Kent two weeks ago and offered to help upstream changes into the kernel so he wouldn't have to interact with any of the kernel people ... he claimed, without being able to explain why, that having a go-between couldn't possibly help, and that if he couldn't dictate the kernel release schedule he wouldn't want to ship the software anyway. And then he proceeded to dunk on btrfs.
So, there are still behavioral issues here, I take it? That is a bummer. This is not news to me, but I thought the situation had changed since then.
So the alternative is ZFS only, or maybe HAMMER2. HAMMER2 doesn't look too bad either, except you need DragonflyBSD for that.
There's no inherent reason why a filesystem in userspace, done right, has to have a measurable performance impact compared to in-kernel. It would be a lot of engineering work, though.
Context switches aren't that expensive when you're staying on the same core; it's IPIs (waking up a thread on another CPU) that are inherently expensive. And you need zero-copy to/from the page cache, which is tricky. But it totally could be done.
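A rough way to convince yourself of the same-core vs. cross-core point is a pipe ping-pong between two processes, once pinned to the same core and once to different cores. This is just a sketch, Linux-only, and it assumes cores 0 and 1 exist on the test machine; the absolute numbers will vary a lot by CPU and mitigations.

```python
# Sketch: measure pipe ping-pong round-trip latency with both processes on the
# same core vs. on different cores. Same-core switches are typically far
# cheaper than cross-core wakeups.
import os
import time

ROUNDS = 50_000

def pingpong(parent_cpu, child_cpu):
    up_r, up_w = os.pipe()      # child -> parent
    down_r, down_w = os.pipe()  # parent -> child
    pid = os.fork()
    if pid == 0:
        # Child: pin to its core and echo every byte back.
        os.sched_setaffinity(0, {child_cpu})
        for _ in range(ROUNDS):
            os.read(down_r, 1)
            os.write(up_w, b"x")
        os._exit(0)
    os.sched_setaffinity(0, {parent_cpu})
    start = time.perf_counter()
    for _ in range(ROUNDS):
        os.write(down_w, b"x")
        os.read(up_r, 1)
    elapsed = time.perf_counter() - start
    os.waitpid(pid, 0)
    return elapsed / ROUNDS * 1e6  # microseconds per round trip

print(f"same core : {pingpong(0, 0):.2f} us/round-trip")
print(f"cross core: {pingpong(0, 1):.2f} us/round-trip")
```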
Modern Darwin is mostly a monolithic kernel, though more and more things are moving to userspace where possible (e.g. DriverKit).
One interesting side effect of the various Spectre mitigations is silicon improving the performance of context switches, which in turn decreases the cost of kernel/userspace transitions. It isn't nearly as expensive as people still believe - though it isn't free either.
This is what bcachefs is based on.
> You'll need make-bcache from the bcache-tools repository. Both the cache device and backing device must be formatted before use.
So, it's far from overlayfs. I could accept formatting the cache device, but not the backing storage.
I got partway through setting up a script to copy recently accessed files from the HDD to the read-prioritized SSD.
My LLMs load up way faster, and I still have a source of truth volume in the huge HDD. It’s not something I’d use professionally though, way too janky.
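For the curious, a script like the one described would look roughly like this - not the author's actual script, just a minimal sketch. The mount points, the 7-day window, and the reliance on atime (so relatime or atime must be enabled on the HDD mount) are all assumptions.

```python
# Sketch: "poor man's read cache" - copy files accessed in the last N days from
# the big HDD volume to a faster SSD directory, mirroring the layout.
import os
import shutil
import time
from pathlib import Path

HDD_ROOT = Path("/mnt/hdd/models")   # assumed slow source-of-truth volume
SSD_CACHE = Path("/mnt/ssd/cache")   # assumed fast SSD cache directory
MAX_AGE_DAYS = 7

cutoff = time.time() - MAX_AGE_DAYS * 86400

for root, _dirs, files in os.walk(HDD_ROOT):
    for name in files:
        src = Path(root) / name
        st = src.stat()
        if st.st_atime < cutoff:     # not accessed recently -> leave on HDD
            continue
        dst = SSD_CACHE / src.relative_to(HDD_ROOT)
        # Skip files whose cached copy is already up to date.
        if dst.exists() and dst.stat().st_mtime >= st.st_mtime:
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)       # copies data and timestamps
        print(f"cached {src} -> {dst}")
```

Run it from cron or a systemd timer and point whatever reads the files (LLM loaders, etc.) at the SSD copy first, falling back to the HDD. Janky, as noted, but it works.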
Don't do RAID 5. Just don't. That's not just a btrfs shortcoming. I lost a hardware RAID 5 due to a "puncture", which would have been fascinating to learn about if it hadn't happened to a production database. It's an academically interesting concept, but it is too dangerous, especially with how large drives are now; if you're buying three, buy four instead. RAID 10 is much safer, especially for software RAID.
Stop parroting lies about btrfs. Since it was marked stable, it has been a reliable, trustworthy, performant filesystem.
But as much as I trust it I also have backups because if you love your data, it's your own fault if you don't back it up and regularly verify the backups.
In the last 10 years, btrfs:
1. Blew up three times on two unrelated systems due to internal bugs (one a desktop, one a server). Very few people were/are aware of the remount-only-once-in-degraded "FEATURE": if a filesystem crashed, you could mount with -o degraded exactly once, and after that the superblock would completely prevent mounting (error: invalid superblock). I'm not sure whether that's still the case or whether it got fixed (I hope so). By the way, these were RAID1 arrays with 2 identical disks with metadata=dup and data=dup, so the filesystem was definitely mountable and usable. It basically killed the use case of RAID1 for availability reasons. ZFS has allowed me to perform live data migrations while missing one or two disks across many reboots.
2. Developers merged patches to mainline, later released to stable, that completely broke discard=async (or something similar), which was a supported mount option according to the manpages. My desktop SSD basically ate itself, and I had to restore from backups. IIRC the bug / mailing-list discussions I found later were along the lines of "nobody should be using it", so no impact.
3. Had (maybe still has - haven't checked) a bug where if you fill the whole disk and then remove data, you can't rebalance, because the filesystem sees no more space available (all chunks are allocated). The trick I figured out was to shrink the filesystem to force data relocation, then re-expand it, then balance (roughly the sequence sketched after this list). That was ~5 years ago, and I even wrote a blog post about it.
4. Quota tracking when using docker subvolumes is basically unusable due to the btrfs-cleaner "background" task (imagine VSCode + DevContainers taking 3m on a modern SSD to cleanup 1 big docker container). This is on 6.16.
5. Hit a random bug just 3 days ago on 6.16, where I was doing periodic rebalancing and removing a docker subvolume. 200+ lines of logs in dmesg, filesystem "corrupted" and remounted read-only. I was already sweating, not wanting to spend hours restoring from backups, but unexpectedly the filesystem mounted correctly after reboot. (first pleasant experience in years)
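For reference, the shrink / re-expand / balance workaround from item 3 boils down to something like the following. This is a hedged reconstruction, not the author's blog post or script; the mount point and shrink amount are assumptions, and the filesystem must have that much reclaimable space to give up.

```python
# Sketch: free up unallocated space on a fully-allocated btrfs filesystem by
# shrinking it (which forces data relocation), growing it back, then balancing.
import subprocess

MOUNT = "/mnt/data"     # assumed btrfs mount point
SHRINK_BY = "-10G"      # assumed temporary shrink amount

def run(*cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Shrink: btrfs relocates data out of the chunks at the end of the device.
run("btrfs", "filesystem", "resize", SHRINK_BY, MOUNT)

# 2. Grow back to the full device size, leaving unallocated space behind.
run("btrfs", "filesystem", "resize", "max", MOUNT)

# 3. With unallocated space available again, a balance can proceed.
run("btrfs", "balance", "start", "-dusage=50", MOUNT)
```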
ZFS in 10y+ has basically only failed me when I had bad non-ECC RAM, period. Unfortunately I want the latest features for graphics etc on my desktop and ZFS being out of tree is a no-go. I also like to keep the same filesystem on desktop and server, so I can troubleshoot locally if required. So now I'm still on btrfs, but I was really banking on bcachefs.
Oh well, at least I won't have to wait >4 weeks for a version that I can compile with the latest stable kernel.
The only stable implementation is Synology's; the rest, even mainline stable, has failed on me at least once in the last 10 years.
I had to disable quota tracking. It lags my whole desktop whenever that shit is running in the background. Makes it unusable on an interactive desktop.
????
> Don't do RAID 5.
Ah, OK, so not FUD
> Stop parroting lies about btrfs.
I seee
Claiming that anyone reporting problems is lying is acting in bad faith and makes your argument weaker.
Also, "works for me" isn't terribly convincing.
I've been in a similar situation, letting everyone know I was fired. Apparently in the US this has a negative connotation, and they use "being let go" (or something as confusing as "handing in / being handed your two weeks' notice", a concept completely unknown here). Here we only have one word for "your company terminating your employment", and there is no negative connotation associated with it. This can be difficult for non-natives. We can come across as very weird or less intelligent.
> If the above offended anyone, I sincerely apology them.
Unless this was tongue-in-cheek, this kind of proves the point that language was the cause. The apology is a good move in any case.
The revised version, "Once the bcachefs maintainer conforms to the agreed process and the code is maintained upstream again" is still lecturing and piling on, as the LWN comments say:
https://lwn.net/Articles/1037496/
It is the classic case of CoC people and their inner circle denouncing someone, after which the entire Internet keeps piling on the target.
How cynical. It's the kernel maintainer, not the bcachefs maintainer, who misbehaves and has a decades-long history of unprofessional behavior.
It is not like he was not explicitly warned.
https://lwn.net/ml/all/bece61a0-b818-4d59-b340-860e94080f0d@...
The ever-escalating drama and cynicism in the reactions this stuff gets, though... bloody hell, what is with people these days?
https://lkml.org/lkml/2013/7/15/374
That is a reasonable compromise. Except when someone actually snaps back at him.
Human nature is wicked.