The Venn diagram of "people who want a modern copy-on-write filesystem with snapshots to manage large quantities of data" and "people who want a massive pool of fault-tolerant storage" (e.g. building a NAS) has some pretty significant overlap.
The latter is where BTRFS is still hobbled: while the RAID-0, RAID-1, & RAID-10 modes work absolutely fine, the RAID-5 & RAID-6 modes are still broken, with an explicit warning at mkfs time (and in the manpages) that the feature is still experimental and should not be used to hold data that you care about retaining. This has bitten, and continues to bite, people, with terabytes of data loss (backups are important, people!). That then sours them on every other aspect of ever using BTRFS again.
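For anyone who hasn't run into it, the warning shows up with something like the following (placeholder devices; the exact wording varies by btrfs-progs version):

```sh
# raid5 data profile across three hypothetical disks; mkfs.btrfs warns that
# the raid5/raid6 profiles are not recommended for data you care about.
mkfs.btrfs -d raid5 -m raid1 /dev/sda /dev/sdb /dev/sdc
```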
> If you ignore explicit warnings at mkfs time and then get upset the warning was accurate, you can't really fully blame the file system for it.
Oh, no doubt. I agree.
> Just raid on a lower layer and btrfs on top.
That has its own set of problems. The conventional RAID solution on Linux (MD) also has some pretty terrifying corruption edge cases with RAID-5 and RAID-6 (as I explained in [1]) which will bite you if you're not aware of them and how to work around them.
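For context, the layering being suggested is roughly this (array, device, and mountpoint names are placeholders):

```sh
# Hypothetical four-disk RAID-6 handled by MD, with btrfs on top seeing only
# the single /dev/md0 device (so btrfs itself provides no redundancy).
mdadm --create /dev/md0 --level=6 --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde
mkfs.btrfs /dev/md0
mount /dev/md0 /mnt/data
```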
A robust filesystem purpose-built for the task can only really be found in ZFS.
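As a rough sketch of what that looks like (pool name and devices are placeholders):

```sh
# raidz2 survives two disk failures; because ZFS owns both the checksums and
# the parity, a scrub can repair a bad block from parity automatically.
zpool create tank raidz2 /dev/sdb /dev/sdc /dev/sdd /dev/sde
zpool scrub tank        # verify everything and self-heal what it can
zpool status -v tank    # lists any devices with checksum errors
```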
Won't silent corruption on the raid level be detected by the integrity checks in btrfs? It won't be able to automatically repair it, but it should give errors at least, right?
Yeah, that would be the "error detection at a higher level" (than MD) part. It'd still be on you to pull one drive at a time from the array until the errors go away (then you know which drive has the corrupted block in that stripe, and can remove the mdadm metadata from it and re-add it to the array so that the kernel forces a clean resync, reconstructing the good block from the parity).

Running the MD "repair" action instead would recompute the parity from the now-corrupted data, overwriting your good parity, and you would have no means of recovering. MD can't know whether the data is bad or the parity is bad because it doesn't know what the data is supposed to look like; even if btrfs does have a checksum for it, that's on a higher, disconnected layer. All filesystems on top of a parity MD suffer from this same vulnerability; some of them won't even be able to tell you when a file has become corrupted (e.g. FAT32), leading to that corruption being persisted into backups.
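Roughly, that dance looks like this; array, mountpoint, and device names are placeholders, so adapt before running anything:

```sh
# Scrub to confirm btrfs is still reporting checksum errors.
btrfs scrub start -B /mnt/data
# Drop one suspect member out of the array.
mdadm /dev/md0 --fail /dev/sdc --remove /dev/sdc
# Scrub again while degraded: reads now come from data+parity reconstruction,
# so if the errors disappear, /dev/sdc held the corrupted block(s).
btrfs scrub start -B /mnt/data
# Wipe its MD metadata and re-add it, forcing a full resync that rebuilds its
# contents from the surviving data and parity.
mdadm --zero-superblock /dev/sdc
mdadm /dev/md0 --add /dev/sdc
```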
If it were only one data block in one stripe I'd be confident re-adding the same drive (and have done so); that is overwhelmingly likely to be a transient error (e.g. bit rot on the platter, or a bit flip while writing, whether in the drive itself or the machine's main memory) that won't recur.
The MD "check" action can confirm this (it will iterate every stripe and report all parity/data mismatches, so if it only reports one ...) and some distributions ship a cronjob that automatically does this on a monthly basis.
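If you'd rather kick it off by hand than wait for the distro's timer, the sysfs interface is (md0 being a placeholder):

```sh
# Start a read-only consistency pass over every stripe in the array.
echo check > /sys/block/md0/md/sync_action
# Watch progress, then read how many sectors didn't match their parity.
cat /proc/mdstat
cat /sys/block/md0/md/mismatch_cnt
```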
If it were a corrupt parity block in a stripe (i.e. a filesystem with strong error detection reports no errors but the MD check action still reports a data/parity mismatch), that usually indicates a lost write during a re-write operation (e.g. the machine was powered off in the middle of updating the contents of a stripe), as the parity is written last -- i.e. the parity would be for the old data in that stripe, not the data as it is now.
The MD "repair" action (if you are ABSOLUTELY CERTAIN that it is the parity that is bad) will correct this, and you should do so: if a disk holding a data block in that stripe later fails, the stale parity will reconstruct incorrect data, which will then start showing up as filesystem errors (if you're fortunate enough to be using such a filesystem).
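The corresponding knob, only to be used once you're sure it's the parity that is stale:

```sh
# Recomputes parity from the current data blocks and rewrites it.
# If the data is actually what's corrupt, this destroys your only good copy.
echo repair > /sys/block/md0/md/sync_action
```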
Of course, all of the usual caveats about checking SMART statistics apply in determining whether a drive is still suitable for continued use. If the same drive kept showing up with the same problems, I'd retire it; if it started reporting an increasing reallocated sector count, I'd retire it; and so on.
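e.g. via smartctl (the attribute names vary somewhat between vendors):

```sh
# Overall health verdict plus the attributes most predictive of failure.
smartctl -H /dev/sdc
smartctl -A /dev/sdc | grep -E 'Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable'
```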