Correct Backups Require Filesystem Snapshots

(cyounkins.medium.com)

50 points · cyounkins · 4y ago · 74 comments

49 comments · 15 top-level

liuliu · 4y ago · 8 in thread

Actually, I have a hard time understanding this. While snapshot reduces file corruptions, it is not a guarantee the best I understand.

A corrupted file can manifest itself in many ways. But ultimately, it has to manifest itself as a business logic error, e.g. you increased the balance on one entity but didn't adjust it on another, causing the sum of balances to change (a corruption).

Thus, without a file system that supports transactions, any discussion of file corruption requires every application to use at least a competent database (such as SQLite) underneath.

And even with transactional support in a file system or a database, you need every application to have correct business logic that uses those transactions correctly as well.

All in all, correct backup cannot be solved universally without knowing all the applications. The best we can do is to probabilistically avoid the obvious issues, e.g. by using FS-level snapshots.

kazinator · 4y ago

A snapshot captures some state of the filesystem that the filesystem actually had at some point. So when you recover that state, and then run the application, it's similar to the OS having crashed or the power having been lost.

Whereas non-snapshot backups don't have that property; the material in the backup is not necessarily identical to any past state of the filesystem that existed. It's something like a past state plus random roll-backs of files, or portions of files, from multiple previous states.

Guess which of these situations applications are much more likely to be able to recover from (if any at all)?

When people write crash recovery code, they typically assume that the world simply stopped, not that it stopped, and then some data was randomly rewound to unspecified older states.

I don't think that you can reasonably defend against data that is restored from a backup, where the oldest part of the backup is an hour older than the newest.

A backup is like a raster scan image of a fast moving object. What should be a rectangular train car looks like a parallelogram: it's not a picture of any scene that existed. Imagine that the raster lines are randomly sampled (not a progressive scan, or even interlaced) and now recover a sane image.

liuliu · 4y ago

I think we need to be a bit more concrete than analogies for my brain to process. So here we go:

If your application relies on persisted state that is coordinated across multiple files (for example, an append-only log and a database snapshot), the application can make reasonable assumptions about when a file's state is committed, such as that the database snapshot is `fsync`ed before the append-only log is started. Ideally, even on a filesystem without transaction support, this order is preserved in time.

However, it may not be preserved by a "file-to-file" backup system, because it can copy the append-only log file before the database snapshot, causing ordering issues. That could result in an "ABA" problem, where the append-only log corresponds to an older database snapshot, and potentially cause issues.

That being said, is this a common scenario (multi-file persisted state) for applications? (I believe SQLite handles this particular ordering issue fine.) I am not sure, and just want to call out that a filesystem snapshot solves a very particular problem.

If your application relies on only one file for its state, or its files are independent of each other (.xlsx, .docx, .markdown, .mp4 or .jpg files), it is a non-issue. And if you need to back up a database, you had better do that with the database-provided tools.
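
For SQLite specifically, the "database-provided tool" can be as small as its online backup API. A minimal sketch in Python, assuming the standard-library `sqlite3` module; the function name `backup_sqlite` and any file names passed to it are made up for illustration:

```python
import sqlite3


def backup_sqlite(source_path: str, dest_path: str) -> None:
    """Copy a SQLite database with SQLite's own online backup API."""
    source = sqlite3.connect(source_path)
    dest = sqlite3.connect(dest_path)
    try:
        # Connection.backup asks SQLite to copy the database page by page,
        # so the copy is produced by the database engine itself rather
        # than by reading the raw file while it may be changing.
        source.backup(dest)
    finally:
        dest.close()
        source.close()
```

The result is an ordinary SQLite file that a file-level backup tool can then pick up at its leisure, since nothing is writing to it.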

kazinator · 4y ago

I think it's not just between files, but within a file.

The backup process opens some N byte file and starts copying it. Some portion of the file is backed up until byte K. At that point the application writes a transaction whereby some of the data is written above K, and some below K. The backup continues and backs up half the transaction above K, combining that with the old data below K before the transaction happened.

If the application structures its scattered write transactions such that they write in increasing offset order, then maybe it's okay.

2 more replies

cyounkins · OP · 4y ago

Yes, you are correct in that it does not guarantee correctness. Point-in-time snapshots are necessary but not sufficient. Without them, the possible corruption scenarios are infinite and cannot be handled or even detected. With them, it is up to the applications to do the right thing in what is equivalent to a power loss event.

adrianmonk · 4y ago

One way of looking at it is that applications already need to be able to survive power loss, a crash of the computer, a crash of the application, or a forcible kill of the application. If they can survive those, they can probably survive a filesystem snapshot being taken at just the wrong moment.

kazinator · 4y ago

Right! But not necessarily a file by file backup.

a1369209993 · 4y ago

> While snapshot reduces file corruptions, it is not a guarantee the best I understand.

The idea is that you'll only encounter corruptions that the application could already hit due to crashes or power outages (and hence hopefully supports recovering from). For example, with naive reads, you might:

  read the first half of the file
  context switch to the application
  application writes to the first half of the file
  application fsyncs previous writes
  application writes to the second half of the file
  context switch back to you
  read the second half of the file

and end up with data in the second half of the file that the application normally only writes after it's sure that corresponding data has been written to the first half.
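
The interleaving above can be reproduced deterministically with an in-memory stand-in for the file. A toy sketch in Python: the "file" is just a `bytearray`, and `naive_backup` and `app_update` are hypothetical names, not part of any real backup tool.

```python
def naive_backup(storage: bytearray, between_halves) -> bytes:
    """Copy storage in two reads, letting the application run in between."""
    half = len(storage) // 2
    first = bytes(storage[:half])    # read the first half of the file
    between_halves()                 # context switch to the application
    second = bytes(storage[half:])   # read the second half of the file
    return first + second


# Consistent starting state: first half "AAAA" goes with second half "aaaa".
storage = bytearray(b"AAAA" + b"aaaa")


def app_update() -> None:
    storage[:4] = b"BBBB"   # application writes (and fsyncs) the first half
    storage[4:] = b"bbbb"   # and only afterwards writes the second half


backup = naive_backup(storage, app_update)
print(backup)   # b'AAAAbbbb'
```

The file only ever held `AAAAaaaa`, `BBBBaaaa` or `BBBBbbbb`, so the copy is a state that never existed, which is exactly what a point-in-time snapshot rules out.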

rzzzt · 4y ago

The Volume Shadow Copy Service mentioned at the end has support for application-level notifications via writers, the most prominent one being SQL Writer: https://docs.microsoft.com/en-us/sql/relational-databases/ba...

ozim · 4y ago · 7 in thread

Most of what is described there seems like "backup larping".

I don't need to backup my Discord and most likely I will be able to simply restore any iTerm configuration that I have in less time than setting up and keeping file system snapshots running.

Any DEVONThink or my excel spreadsheets or invoices / documents that I care about I can simply back up after I am done working on them. I usually work on one or two at a time. When my laptop dies I can probably just remember what needed to be done and redo the work.

Restoring a file system snapshot would usually be much more hassle than filling in a single invoice again or downloading it from some provider again.

For web applications there is usually an SLA that specifies how much data can be lost, like 1 hour or 30 mins - but no one will realistically guarantee "no data lost" - because imagine doing full snapshot of 1 Terabyte drive it takes I suppose at least an hour anyway.

cyounkins · OP · 4y ago

Sure, I've been accused of being overly concerned with correctness. :-)

> because imagine doing full snapshot of 1 Terabyte drive it takes I suppose at least an hour anyway.

On copy-on-write filesystems, snapshots are nearly instantaneous.

> time than setting up and keeping file system snapshots running

As shown in the code snippet, pretty much all it takes is a few snapshot commands around the backup command and changing the source directory to the snapshot mount point.
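
For ZFS, "a few snapshot commands around the backup command" might look roughly like the sketch below. This is not the article's snippet. The dataset, the mount point and the backup command are placeholders supplied by the caller, and the sketch assumes the snapshot is reachable under `<mountpoint>/.zfs/snapshot/<name>`.

```python
import subprocess
from typing import List


def snapshot_backup_commands(dataset: str, mountpoint: str,
                             backup_cmd: List[str],
                             snap_name: str = "backup") -> List[List[str]]:
    """Build the three commands: create snapshot, back it up, destroy it."""
    snapshot = f"{dataset}@{snap_name}"
    # The backup reads from the frozen snapshot, not the live directory.
    source_dir = f"{mountpoint}/.zfs/snapshot/{snap_name}"
    return [
        ["zfs", "snapshot", snapshot],
        backup_cmd + [source_dir],
        ["zfs", "destroy", snapshot],
    ]


def run_snapshot_backup(dataset: str, mountpoint: str,
                        backup_cmd: List[str]) -> None:
    """Run the commands, removing the snapshot even if the backup fails."""
    create, backup, destroy = snapshot_backup_commands(
        dataset, mountpoint, backup_cmd)
    subprocess.run(create, check=True)
    try:
        subprocess.run(backup, check=True)
    finally:
        subprocess.run(destroy, check=True)
```

Calling `run_snapshot_backup("tank/home", "/home", ["my-backup-tool"])` would run the hypothetical `my-backup-tool` against `/home/.zfs/snapshot/backup` instead of `/home`.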

ozim · 4y ago

Different people care about different things :)

I also think about snapshots differently.

For me a snapshot is an actual copy stored on a different hard drive or other medium, so the copy operation is never going to be instant.

The scenario in the article is power loss, so that is also not something I worry about that much. Mostly I worry about hardware failure where I would not be able to turn on my laptop/server again. For power loss, a UPS or the laptop battery is my go-to solution.

2 more replies

LanternLight83 · 4y ago

These arguments apply to external backups, but if you use a filesystem like ZFS or BTRFS, then snapshots are atomic, essentially instant, and can even be sent over the network as diffs/deltas against the last snapshot that was sent, so backing up a TB drive over the network every hour is totally reasonable. These filesystems also give you access to snapshots via the FS, so you don't need to restore a snapshot to access your files: it's as easy as mounting the read-only snapshots directory, which you could even just keep mounted 24/7. That makes deleted files just a `cd` away, or quicker still with a shell script that jumps first into the last snapshot of your current directory and then steps back to the one before that, and so on.

Not that I've gotten around to writing that script for myself :c

But the automatic snapshots are much easier c:
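
For what it's worth, the lookup half of that script could look something like this, in Python rather than shell. A rough sketch, assuming a ZFS dataset whose snapshots appear under `<mountpoint>/.zfs/snapshot/<name>` and whose snapshot names sort chronologically (e.g. timestamps); `snapshot_versions` is a made-up name.

```python
from pathlib import Path
from typing import List


def snapshot_versions(mountpoint: Path, current: Path) -> List[Path]:
    """Return the copies of `current` found in snapshots, newest name first."""
    relative = current.resolve().relative_to(mountpoint.resolve())
    snapshot_root = mountpoint / ".zfs" / "snapshot"
    if not snapshot_root.is_dir():
        return []
    found = []
    for snap in snapshot_root.iterdir():
        candidate = snap / relative
        if candidate.exists():          # skip snapshots older than the path
            found.append((snap.name, candidate))
    # Assumption: sorting by snapshot name gives chronological order.
    found.sort(key=lambda pair: pair[0], reverse=True)
    return [path for _, path in found]
```

A shell wrapper could then `cd` into the first entry, and into the next one on each repeated call.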

forgotmypw17 · 4y ago

> I don't need to backup my Discord

If you don't back up your Discord, what will you do when Discord changes how it works or stops working?

ozim · 4y ago

The important part of Discord is the people - having backup means of communication on different platforms is best, I think.

kroltan · 4y ago

The application is self-modifying and partially server-defined, as well as being an online communication tool. Discord-the-company can change it at basically any time, and disallow any versions deemed "inappropriate".

pixl97 · 4y ago

>my excel spreadsheets or invoices / documents

Easy when you work on simple documents. I wouldn't want my accountant doing that.

throw7 · 4y ago · 5 in thread

Wouldn't you need to quiesce the db or application before the snapshot?

jjnoakes · 4y ago

Not usually. A well-written application that cares about your data should be written in a way to survive sudden power loss, and to such an application, a file-system-level snapshot taken at an arbitrary point in time looks basically the same as sudden power loss.

Now, that being said, if you have the ability to set up backups where you can minimize the file system activity, you might be slightly better off doing it, but the ROI is probably fairly low unless it's extremely trivial to set that up for everything that's running.

compsciphd · 4y ago

that's not really true. there's a reason VSS exists on windows.

1 more reply

blibble · 4y ago

yep, otherwise you risk some state being in memory and not on the disk

topspin · 4y ago

Your answer is too definitive. Applications and databases can indeed suffer power loss while losing nothing of value. An ACID database used by a correctly designed application will lose nothing on power loss; state "in memory" is uncommitted and, in such a system, effectively doesn't exist. It never happened and, furthermore, no one cares.

The term of art is "crash consistent," and any ACID database must preserve all committed state across events such as power loss. Such a database is correctly backed up when copying a simultaneous point in time snapshot across all involved volumes.

Not all databases are truly ACID. Lots of software relies on uncommitted database state. But we're talking about a solved problem here; if you require ACID behavior the means to achieve exactly that are available. Any exception to that statement, including hardware misfeatures or lack of two phase commit across databases, is equivalent to "incorrectly designed."

In a correctly designed system quiescing the database isn't necessary, but might still be used as a precaution or a performance optimization.
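
A small illustration of "uncommitted state effectively doesn't exist", using SQLite from Python's standard library. This is only a sketch: closing the connection without committing stands in for the crash (it is not a real power-loss test), and the `accounts` table and names are invented for the example.

```python
import os
import sqlite3
import tempfile


def total_balance(path: str) -> int:
    """Open the database fresh, as a restarted application would."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute("SELECT SUM(balance) FROM accounts").fetchone()[0]
    finally:
        conn.close()


db_path = os.path.join(tempfile.mkdtemp(), "bank.db")

conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 100)])
conn.commit()

# Begin a transfer but "crash" halfway through: alice has been credited,
# bob has not been debited yet, and nothing has been committed.
conn.execute("UPDATE accounts SET balance = balance + 50 WHERE name = 'alice'")
conn.close()   # stand-in for the crash; the open transaction is discarded

print(total_balance(db_path))   # 200
```

The half-finished transfer never becomes visible, so the sum of balances is unchanged after the "crash".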

1 more reply

SoftTalker · 4y ago

Exactly. A snapshot is a point in time on the storage media. It allows you to make a backup as if the backup occurred at that point in time, i.e. files are not changing while the backup is running. A snapshot is not a guarantee that any particular file is consistent with the state of the application that is using it. It's still a backup of an "open" file with all the potential issues that implies.

compsciphd · 4y ago · 4 in thread

even with file system snapshots, your backup can be corrupt.

the example of the browser profile is a classic case.

imagine multiple writes occur and all of them have to occur for the profile to be correct (either multiple files are being written or firefox will write multiple blocks to the file). if the snapshot operation occurs in the middle, then the snapshot will be "corrupt".

I believe this is actually the entire point of Windows' volume shadow service (which is sort of poo-pooed in the article), to enable applications to tell the snapshot mechanism "wait, I'm in the middle of a file system transaction" and then to pause writes until the snapshot operation occurs after they finish the in process transaction.

without such a mechanism, you are always going to be at risk with snapshots.

In https://www.cs.columbia.edu/~nieh/pubs/sosp2007_dejaview.pdf we avoided this problem by combining 2 mechanisms, without having to modify applications to work with such a service:

1) we used a log structured file system (that was inherently a snapshot, every log entry was individually mountable) and 2) we used a checkpoint/restart mechanism that saved process state and enabled us to restart the processes combined with the file system state as it was at checkpoint time. (Checkpoint would also sync all dirty pages to disk and that fs state after the sync is what we tied to the checkpoint state).

So when a process was resumed, the file system would look exactly as the process expected it, even if the process was in the middle of what can be referred to as a transaction. But that only worked because the processes were restored along with the file system; if we had only restored the file system, it could have been inconsistent.

grdomzal · 4y ago

> I believe this is actually the entire point of Windows' volume shadow service (which is sort of poo-pooed in the article), to enable applications to tell the snapshot mechanism "wait, I'm in the middle of a file system transaction" and then to pause writes until the snapshot operation occurs after they finish the in process transaction.

Former maintainer of VSS here. Yes, that's exactly right. In fact, filesystem snapshots are usually not enough for true application level consistency. As others have noted elsewhere, a filesystem snapshot is the equivalent to yanking the power cord out of the back of your computer. It's good, assuming your filesystem does atomic writes / copy-on-write / write-to-new. But we can do even better.

Imagine I have two databases - one traditional relational DB storing my app content, and a second log database. You want to keep these in sync. Well, good luck doing this with the filesystem alone. Usually your DBMS will need to be involved as well, and this is where the VSS "writer" concept comes in. When a snapshot is being taken, applications such as SQL will be invited to participate. Typically, this means they'll start to hold up writes so that things will be quieter for the snapshot. But they will also have a chance, after the fact, to actually clean up the snapshot itself. In this case, the DBMS could roll back the log database to match the content database.

It's correct that NTFS doesn't support snapshots natively, but Windows has the volsnap.sys driver that takes care of it. For all intents and purposes, NTFS does support copy-on-write snapshots.

Complicated? Sure. But it was quite capable, and actually a pretty cool (but sadly under appreciated) piece of technology.

cyounkins · OP · 4y ago

Thanks for writing this! Am I correct in understanding that VSS requires writer support and that support is not widespread in desktop applications, eg web browsers?

Twirrim · 4y ago

This article also fails to consider that on modern systems writes are often batched and not immediate. The only way to get an actually consistent and safe backup of any file / file system is to actually shut down the machine so that all applications, and the OS, have shut down gracefully.

As an intermediate step, it may be practical to consider shutting down your application, flushing all writes (e.g. via the sync command), taking the file system snapshot, restarting the application, and then taking the backup from that file system snapshot. At least then your critical data should have a consistent and safe backup, even if the rest of the OS _may_ be in a suspect state.
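
The ordering of that intermediate step can be sketched in a few lines of Python. The three callables are placeholders for whatever stops the application, takes the snapshot and starts the application on a given system; only `os.sync` is a real call here, and it exists on Unix-like systems only.

```python
import os
from typing import Callable


def quiesced_snapshot(stop_app: Callable[[], None],
                      take_snapshot: Callable[[], None],
                      start_app: Callable[[], None]) -> None:
    """Stop the application, flush writes, snapshot, then restart it."""
    stop_app()
    try:
        if hasattr(os, "sync"):
            os.sync()        # ask the OS to flush buffered writes to disk
        take_snapshot()
    finally:
        start_app()          # restart even if the snapshot step failed
```

The backup itself then runs against the snapshot after this function returns, so the application is only down for the duration of the snapshot rather than the whole backup.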

AshamedCaptain · 4y ago

The worst part is that the article itself says this. This is exactly like "pulling the power on the running system" and then imaging the hard disk. Even if it happens to work, that is not really a good idea.

bretpiatt · 4y ago · 4 in thread

A system snapshot for a personal computer is potentially a really difficult thing to restore (ex. you break the device and need a new one, unless it matches exactly restoring snapshot likely puts you in OS recovery mode).

A volume snapshot, if you're technical enough to split up OS and application data volumes on your local drive, can potentially help, but you still have registry or other issues.

The best way to handle backups, from the perspective of a backup company CEO, is to back up your critical data, and that could include application configurations.

Browsers support exporting your profile, and good backup software lets you run a pre-backup job script and a post-job script, so you can export your browser profile, lock certain files or folders during a backup, etc. to get a good recovery point while minimizing the risk of data corruption.

Way longer topic than a comment here. If you do want full system state, pick a hypervisor (Parallels, etc.) and then back up the full VM each time you power the VM down. It'll make for much larger backups, so restores are slower and storing all the recovery points will cost you a bit more too.

jiggawatts · 4y ago

This is 100% false and nobody should ever follow this advice.

I use snapshots all the time because Windows uses them automatically to get complete and consistent backups.

Snapshots are always better than no snapshots.

Full backups are always safer than partial backups.

It takes just ONE forgotten path or file to make a backup completely impossible to restore.

Don’t be a smart ass with backups. Just don’t.

bretpiatt · 4y ago

A full system snapshot is not always 100% the correct solution; data protection is complicated.

A hypothetical example: you have deeply embedded malware and you're unsure when it infected the system; it could have been sitting latent for months. In this situation you have to either restore far back and lose months of application data, or restore a recent snapshot and then try to remove the malware in a clean room. It is much safer and easier to start from a clean OS and then restore only the data, which is much simpler to scan for infection and sanitize.

iggldiggl · 4y ago

> A system snapshot for a personal computer is potentially a really difficult thing to restore (ex. you break the device and need a new one, unless it matches exactly restoring snapshot likely puts you in OS recovery mode).

While I am sure the process has limitations, I think that modern Windows has some amount of flexibility there, because due to a defect I've recently had to swap out my mainboard + CPU, and the system booted up just fine. The most major issue was that with the default drivers Windows had reverted to, the Ethernet controller didn't work, which could have been a little bit of a Catch-22 (no internet without the correct drivers, and no correct drivers without internet), but in practice I just downloaded the drivers on a different device and transferred them over via USB.

bretpiatt · 4y ago

Yeah, plug and play drivers are much better today than years ago, but they're still challenging in many situations.

For someone with your technical skill level it's very manageable; most folks don't know what a mainboard is.

To an average user it's much easier to work with restoring their My Documents (and if they need full system state have Parallels save VMs under that folder path).

ReactiveJelly · 4y ago · 2 in thread

Completely true

But also, if you have no backups, incorrect ragged backups of only non-changing files are a _hell_ of a lot better than no backups at all.

And if you get in the habit of doing incremental backups every day, you might be lucky enough to find a non-corrupted version of a file that's a few days stale.

drewzero1 · 4y ago

For a long time I didn't know where to start with backups, because I was convinced that incomplete or unreliable backups were a total waste of time and money. (Which tbf isn't 100% false.)

Eventually I realized that some backups are better than no backups at all, and prioritized getting _something_ in place to at the very least have important files copied regularly to another drive somewhere.

michaelt · 4y ago

> I was convinced that incomplete or unreliable backups were a total waste of time and money. (Which tbf isn't 100% false.)

It depends a lot on your use case.

If you're backing up a production database system that's in constant use, then the files are almost certain to change while the backup is in progress. And a backup of everything on the database server except the data probably isn't what you want!

On the other hand, if I'm backing up my personal computer, my family photos aren't getting regularly overwritten - and if the backup of my web browser cache is inconsistent? So be it.

2 more replies

Havoc · 4y ago · 2 in thread

>On Linux I strongly recommend ZFS due to its long history of reliability

I've not been seeing any of this much vaunted reliability. 3 out of 3 of the last drives I lost data on were ZFS. Meanwhile NTFS and EXT4 have been fairly trouble free.

Though to be fair, it's hard to tell how much of that is TrueNAS vs ZFS.

XorNot · 4y ago

This seems very unlikely without a lot of contributing factors. Unless you weren't running any RAID, in which case ZFS didn't lose your data, but it would've been quite upfront about corruption when it happened.

Havoc · 4y ago

>This seems very unlikely

That's what I thought given that it does have a good rep.

Haven't given up on it yet, but definitely less trusting - 3 different drives in 3 different ways wasn't expected.

That said - all consumer class gear in janky environments so there may very well be other factors lurking.

1 more reply

aborsy · 4y ago · 1 in thread

Ubuntu supports ZFS with one click at installation time. You can snapshot your system every 15 minutes or as you desire.

ComputerGuru · 4y ago

Note that it’s not really a supported option and was supposed to be removed from the installer for the latest release but that was cancelled because there wasn’t enough time to do so before the freeze. It’s certain to be dropped in the next release.

So by all means, play with it but don’t consider it to be stable.

https://bugs.launchpad.net/ubuntu/+source/ubiquity/+bug/1966...

https://code.launchpad.net/~ubuntu-installer/ubiquity/+git/u...

Edit: fixed link.

notacoward · 4y ago · 1 in thread

Should clarify that OP is talking about a consumer use case. In the enterprise space, quiescing applications and databases during a snapshot was standard at least as far back as 1992 when I started working on HA software which would (among its other functions) automate the process. For quite a few years after that, I worked on systems where block-level snapshots were more common (and performed better) than filesystem-level. Filesystem snapshots are great, and it's also great to warn people about the dangers of doing backups without some sort of protection against in-flight changes, but the title as written isn't quite correct.

viraptor · 4y ago

For quite a few years after 1992 we didn't really have simple solutions that could do atomic snapshots. I'm not sure about specialised enterprise cases, but for Linux consumers, the first one would be LVM, right? (And that was not write-barrier-safe until 2010.)

mst · 4y ago

Given ZFS it's potentially worth looking at https://github.com/jimsalterjrs/sanoid

dboreham · 4y ago

Note the article seems to be considering desktop/laptop systems. If you're backing up a server running a database you have another whole set of problems, but generally you won't need point-in-time (PIT) snapshot semantics from the filesystem. You get that from the database's storage engine (e.g. via write-ahead logging). Backup in that context is done in cahoots with the storage engine such that a consistent-after-recovery set of filesystem data is backed up. Databases such as Oracle, SQL Server and PostgreSQL existed long before filesystems with consistent snapshot capability were common.

Freaky · 4y ago

I wrote https://github.com/Freaky/zfsnapr a few months ago so I could finally have point-in-time consistent Borg backups with ZFS snapshots, without having the mess of teaching Borg where every .zfs directory was.

It recursively snapshots mounted pools, and recursively mounts snapshots of the mounted datasets into a target ready to point your backup tools at. I do so via a chroot so I didn't need to make any changes to my Borg setup - just to how I run it.

fudgy · 4y ago

https://www.arqbackup.com/ uses snapshots both on macOS and Windows!

nix0n · 4y ago

The other option is downtime: boot into another OS, so you can mount the disk to be backed up as read-only.

NewEntryHN · 4y ago

Nobody cares about correct backups. Back up your pictures and payslips and whatever paper you're working on.
