https://arstechnica.com/gadgets/2020/01/linus-torvalds-zfs-s...
So cold data (cold write, cold/hot read) will take less and less space over time while still having the same read performance.
(It would also be a performance nightmare - you'd have a permanent indirection table you'd need to use for _everything_, and if you've ever seen how ZFS dedup performs with its indirection table not on dedicated SSDs, you can understand why this is terrible.)
* https://openzfs.github.io/openzfs-docs/Basic%20Concepts/dRAI...
Would be great for home use, where I have a lot of drives collected over the years that are not the same size.
EDIT: The more I read into this, the more it seems to assume that all drives must be the same size.
That way, if one disk fails, the reserved space is used to write the data necessary to keep the array consistent. Because the free space is distributed randomly across the array, the write performance of a single drive doesn't become a bottleneck.
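For reference, a sketch of what this looks like on the command line, assuming a dRAID-capable OpenZFS build (dRAID shipped in OpenZFS 2.1; the pool, layout, and disk names here are made-up examples):

```shell
# Create a dRAID2 vdev: double parity, 4 data disks per redundancy group,
# 1 distributed spare, across 11 children total.
# Syntax: draid[<parity>][:<data>d][:<children>c][:<spares>s]
zpool create tank draid2:4d:11c:1s \
    sda sdb sdc sdd sde sdf sdg sdh sdi sdj sdk

# After a failure, rebuild into the distributed spare space that is
# spread across all surviving disks (spares are named draid<p>-<vdev>-<spare>):
zpool replace tank sda draid2-0-0
```

Because the rebuild writes go to spare space on every surviving disk rather than to one dedicated hot spare, the resilver isn't bottlenecked on a single drive's write throughput.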
This is unrelated to the ability to remove drives from a pool (which is difficult to support in ZFS due to design constraints)
dRAID, Finally![0]
One thing I am wondering about is this:
> Redacted zfs send/receive - Redacted streams allow users to send subsets of their data to a target system. This allows users to save space by not replicating unimportant data within a given dataset or to selectively exclude sensitive information. #7958
Let’s say I have a dataset tank/music-video-project-2020-12 or something, and it is about 40 GB, and I want to send a snapshot of it to a remote machine over an unreliable connection. Can I use the redacted send/recv functionality to send the dataset a chunk at a time and then, at the end, have a perfect copy of it that I can send incremental snapshots to?
> Redacted send/receive is a three-stage process. First, a clone (or clones) is made of the snapshot to be sent to the target. In this clone (or clones), all unnecessary or unwanted data is removed or modified. This clone is then snapshotted to create the "redaction snapshot" (or snapshots).
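Based on the quoted docs, the three stages map onto the CLI roughly like this (the dataset, bookmark, file, and host names are invented for illustration):

```shell
# 1. Clone the snapshot and strip out whatever shouldn't be sent
zfs clone tank/ds@snap tank/ds_redacted
rm /tank/ds_redacted/secrets.txt
zfs snapshot tank/ds_redacted@redacted

# 2. Record which blocks differ in a "redaction bookmark"
zfs redact tank/ds@snap redact_book tank/ds_redacted@redacted

# 3. Send the original snapshot, minus the redacted blocks
zfs send --redact redact_book tank/ds@snap | ssh host zfs receive pool/ds
```

Note that redaction is about omitting data, not surviving flaky links; for an unreliable connection the relevant feature is resumable send/receive (`zfs receive -s` on the target, then `zfs send -t <token>` to resume an interrupted stream).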
Think of it like a selective sync in Dropbox or SyncThing at the FS level.
That's not to say rsync doesn't work. It does. But it doesn't scale well, and the data integrity guarantees aren't there.
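For comparison, an incremental replication sketch (dataset and host names are placeholders): only blocks changed between the two snapshots cross the wire, and the stream is built from ZFS's own checksummed block tree rather than re-scanning every file the way rsync must.

```shell
zfs snapshot tank/data@monday
# ... a day of changes ...
zfs snapshot tank/data@tuesday

# Send only the delta between the two snapshots
zfs send -i tank/data@monday tank/data@tuesday | ssh backup zfs receive pool/data
```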
btrfs seems like the main alternative if you want native kernel support, but when I checked a couple years ago there seemed to be a lot of concerns about the stability. Is that still the case?
[1] https://lore.kernel.org/linux-btrfs/20200627032414.GX10769@h...
[2] https://www.man7.org/linux/man-pages/man8/mkfs.btrfs.8.html#...
[1] https://lore.kernel.org/linux-btrfs/
[2] https://lore.kernel.org/linux-btrfs/CAD7Y51i=mTDnEWEJtSnUsq=...
[3] https://lore.kernel.org/linux-btrfs/CAMXR++KUj2L7qpR7QZeiM2T...
(But as others have pointed out, there are options for using zfs on linux, too)
1. It often happens that the main repo offers a new kernel, but the corresponding module is not ready on OBS yet. This means upgrading to the latest rolling release cannot just happen at any time but requires careful planning. This is a big inconvenience.
2. In the past, dracut sometimes just failed to pick up the module for the initrd, causing a boot failure at the next system start. I could not figure out why; this never happened with the first-class supported ext/xfs.
3. The distro's boot/rescue media do not contain the driver. This means a third-party boot medium is required to get into a broken system, and repairing it with chroot becomes much more complicated because of the different distro.
A friend did a video based on my blog: https://www.youtube.com/watch?v=PILrUcXYwmc
Or you can use the latest Ubuntu, which ships with ZFS.
For the most part, yes. Occasionally a kernel developer who seems to be bitter about a company that doesn't exist any more tries to break compat with ZFS, but it's generally smooth sailing on Fedora, Debian, and CentOS, with dkms handling the building of modules seamlessly.
Do we have encryption yet?
Use btrfs, trust me, it's stable now... well, the commands are terrible compared to ZFS. All my servers are FreeBSD, but on the laptop and on one workstation I've been running openSUSE Tumbleweed for about two years and it works great.
Really? I don’t think so, I find btrfs usage extremely straightforward and easy to grok. ZFS on the other hand has all that confusing lingo about vdevs, etc...
I get that this is subjective but I disagree.
I switched my FreeBSD box over to Debian about two years ago. No complaints so far :)
For me, that gives a unicorn 100% of the time (tried across several minutes), instead of showing the developer profile.
Anyone else seeing that?
Many thanks to the various OpenZFS contributors.
I've seen people use it as a rootfs on RPis, and have personally run it on Pis for brief occasions without encountering any RAM problems.
(Sorry if noise; I'm just trying to get an idea of how relevant this 2.0 release is to me.)
Previously it was called ZFS on Linux, but ZFS development is now unified in the "OpenZFS" codebase shared between Linux and FreeBSD, since much of the overall ZFS development effort ended up there.
I realized how bad the performance was when it took about 2 hours to delete 1000 files.
Deduplication is the process for removing redundant data at the block level, reducing the total amount of data stored. If a file system has the dedup property enabled, duplicate data blocks are removed synchronously. The result is that only unique data is stored and common components are shared among files.
Deduplicating data is a very resource-intensive operation. It is generally recommended that you have at least 1.25 GiB of RAM per 1 TiB of storage when you enable deduplication. Calculating the exact requirement depends heavily on the type of data stored in the pool.
Enabling deduplication on an improperly designed system can result in performance issues (slow I/O and administrative operations) and can even prevent a pool from importing due to memory exhaustion. Deduplication can consume significant CPU and memory, as well as generate additional disk I/O.
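As a sketch, turning it on and checking the cost looks like this (pool and dataset names are examples; the 20 GiB figure just applies the 1.25 GiB/TiB rule of thumb above):

```shell
# Enable dedup per dataset; it applies to writes from this point on
zfs set dedup=on tank/backups

# Inspect the dedup table (DDT) histogram and the achieved ratio
zpool status -D tank
zpool get dedupratio tank

# Rough RAM budget: a 16 TiB pool at 1.25 GiB/TiB wants
# roughly 16 * 1.25 = 20 GiB of RAM just to keep the DDT fast.
```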
ZFS also has a huge legacy. Right now the license (probably) prevents you from legally shipping a compiled ZFS module with the Linux kernel; just solving that seems insurmountable. It's also supported on Illumos and FreeBSD, so trying to refactor it to use the Linux page cache would risk introducing bugs on those platforms.