kalleth 4y ago

I'd be surprised if they needed backups for a few hours of downtime with (reportedly) complete recovery where no data was corrupted. There are industries where this would be required, and it's possible I guess, but neither of these downtime events was a "data loss" event, just availability events for short-ish periods of time that wouldn't - for me - result in activating our DR plans.

I must admit that I do always try and maintain a separate data backup for true disaster recovery scenarios - but those are mainly focused around AWS locking me out of our AWS account (and hence we can't access our data or backups) or recovering from a crypto scam hack that also corrupts on-platform backups, for example.

aeonflux 4y ago

I once had to argue that we still need backups even though S3 has redundancy. They laughed when I mentioned a possible lockout from AWS (even due to a mistake or whatever). I asked what if we delete data from app by mistake? They told me we need to be careful not to do that. I guess I am getting more and more tired of arrogant 25-year-old programmers with 1-2 years in the industry and no experience.

swid 4y ago

One thing you should absolutely not count on, but might be a course of action for large clients, is to contact support and ask them to restore accidentally / maliciously deleted files.

I would never use this as part of the backup and restore plan, but I was lucky when a bunch of customer files were deleted due to a bug in a release. Something like 100k files were deleted from Google Storage without us having a backup. In a panic we contacted GCP. We were able to provide a list of all the file names from our logs. In the end, all but 6 files were recovered.

I think it took around 2-3 days to get all the files restored, which was still a big headache and impactful to people.

gime_tree_fiddy 4y ago

This is not a reliable mechanism btw. There will be times when they won't be able to restore the data for you. Their product has options to avoid this situation like object versioning.

manquer 4y ago

S3 (and others) have version history that can be enabled.

If you have to take care of availability and redundancy and delete protection and backups, then why pay the premium S3 is charging?

Either you don't trust the cloud and you can run NAS or equivalent (with s3 APIs easily today) much cheaper or trust them to keep your data safe and available.

No point in investing in S3 and then doing it again yourself.

ncallaway 4y ago

> No point in investing in S3 and then doing it again yourself.

I mean that's just obviously wrong, though.

There is a point.

> Either you don't trust the cloud and you can run NAS or equivalent (with s3 APIs easily today) much cheaper or trust them to keep your data safe and available.

What if you trust the cloud 90%, and you trust yourself 90%, and you think the failure cases of the two are likely to be independent? Then it seems like the smart decision would be to do both.

Your position is basically arguing that redundant systems are never necessary, because "either you trust A or you trust B, why do both?" If it's absolutely critical that you don't suffer a particular failure, then having redundant systems is very wise.
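
To put rough numbers on the 90%/90% argument: a minimal sketch, assuming the two failure modes really are independent (the assumption stated in the comment) and reading "trust 90%" as a 10% chance of losing the data. The function name and the figures are illustrative only, not measurements.

```python
from math import prod

def joint_loss_probability(*loss_probabilities: float) -> float:
    """Probability that every copy is lost, assuming the failures are independent."""
    return prod(loss_probabilities)

cloud_loss = 0.10  # "trust the cloud 90%" read as a 10% chance of loss (assumption)
self_loss = 0.10   # "trust yourself 90%" read the same way (assumption)

both = joint_loss_probability(cloud_loss, self_loss)
print(f"either copy alone: {cloud_loss:.0%} chance of loss")
print(f"both copies lost:  {both:.0%} chance of loss")
```

If the failures are correlated (say, the same leaked credential can delete both copies), the product no longer applies and the real figure is higher.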

kalleth (OP) 4y ago

In most startups? You're mostly correct.

But you still have some risks here, yes, with a super low probability, but a company-killing impact.

In some industries - banking, finance, anything regulated, or really (I'd argue) anywhere where losing all of your data is company killing - you will need a disaster recovery strategy in place.

The risks requiring non-AWS backups are things like:

- A failed payment goes unnoticed and AWS locks you out of your AWS account, which also goes unnoticed, and the account and data are deleted

- A bad actor gains access to the root account (by faxing Amazon a fake notarized letter, finding a leaked AWS key, or social engineering one of your DevOps team) and encrypts all of your data while removing your AWS-based backups

- An internal bad actor deletes all of your AWS data because they know they're about to be fired

...and so on.

There are so many scenarios that aren't technical which can make a single vendor dependency for your entire business unwise.

A storage array in a separate DC somewhere where your platform can send (and only send! not access or modify) backups of your business critical data ticks off those super low probability but company-killing impact risks.

This is why risk matrices have separate probability and impact sections. Minuscule probability but "the company directors go to jail" impact? Better believe I'm spending some time on that.
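
A rough sketch of why keeping probability and impact separate matters. The 1-5 scales, the thresholds, and the example entries are all hypothetical, picked only to illustrate the point; this is not a standard scoring scheme.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Risk:
    name: str
    probability: int  # 1 (rare) .. 5 (almost certain); hypothetical scale
    impact: int       # 1 (minor) .. 5 (company-killing); hypothetical scale

def needs_mitigation(risk: Risk, score_threshold: int = 12, impact_ceiling: int = 5) -> bool:
    """Flag a risk if its combined score is high OR its impact alone is catastrophic."""
    score = risk.probability * risk.impact
    return score >= score_threshold or risk.impact >= impact_ceiling

risks = [
    Risk("AWS account locked and data deleted", probability=1, impact=5),
    Risk("A few hours of regional downtime", probability=3, impact=2),
]
for r in risks:
    print(f"{r.name}: mitigate={needs_mitigation(r)}")
```

A single multiplied score would rank the account lockout (1 x 5 = 5) below the downtime event (3 x 2 = 6); checking impact on its own is what surfaces the low-probability, company-killing case.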

jeremyjh 4y ago

There are completely independent risks that you are dealing with here. If you are a small company there is a not-insignificant risk that your cloud account will be closed and it will be impossible to find out why or to fix it in a timely manner. There have been several that were only fixed after being escalated to the front page of Hacker News, and we haven't heard about the ones that didn't get enough upvotes to get our attention and were never fixed.

Also, what we saw on Dec 7th was that the complexity of Amazon's infrastructure introduces risks of downtime that simply cannot be fully mitigated by Amazon, or by any other single provider. More redundancy introduces more complexity at both the micro level and macro level.

It doesn't really cost that much to at least store replicated data in an independent cloud, particularly a low-cost one like Digital Ocean.

scurvy 4y ago

Backup on site and store tertiary copies in a cloud. Storing all backups in AWS wouldn't meet a lot of compliance requirements. Even multiple AZs in AWS would not pass muster as there are single points of failure (API, auth, etc).

hinkley 4y ago

Whether you realize it or not, you believe in the Scapegoat Effect, and it's going to get you into a shitload of trouble some day.

Customers don't care if it's your fault or not, they only care that your stuff is broken. That safety blanket of having a vendor to blame for the problem might feel like it'll protect your job but the fact is that there are many points in your career where there is one customer we can't afford to lose for financial or political reasons, and if your lack of pessimistic thinking loses us that customer, then you're boned. You might not be fired, but you'll be at the top of the list for a layoff round (and if the loss was financial, that'll happen).

In IT, we pay someone else to clean our offices and restock supplies because it's not part of our core business. It's fine to let that go. If I work at a hotel or a restaurant, though, 'we' have our own people that clean the buildings and equipment. Because a hotel is a clean, dry building that people rent in increments of 24 hours. Similarly, a restaurant has to build up a core competency in cleanliness or the health department will shut them down. If we violate that social contract, we take it in the teeth, and then people legislate away our opportunities to cut those corners.

For the life of me I can't figure out why IT companies are running to AWS. This is the exact same sort of facilities management problem that physical businesses deal with internally.

I have saved myself and my teams from a few architectural blunders by asking the head of IT or Operations what they think of my solution. Sometimes the answer starts with, "nobody would ever deploy a solution that looked like that". Better to get that feedback in private rather than in a post-mortem or via veto in a launch meeting. But I have had less and less access to that sort of domain knowledge over the last decade, between Cloud Services and centralized, faceless IT at some bigger companies. It's a huge loss of wisdom, and I don't know that the consequences are entirely outweighed by the advantages.

b112 4y ago

Erm.

In some orgs, recreating lost data, code, deployment and more is literally hundreds of thousands of hours of work.

In a smaller org, the devastation can be just as stark. Losing hundreds of hours of work can be a death knell.

Anyone advocating placing an entire org's future on one provider is literally, completely incompetent.

It's the equiv of a home user thinking all their baby pics will be safe on google or facebook. It is just plain dumb.

hayd 4y ago

Having an additional AWS account that some S3 data backs up to, with write-only permissions (no delete) and in an account that is not used by anyone, seems like a good idea for this type of situation/concern.
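
One way the write-only idea might look as an S3 bucket policy, built here as a plain Python dict so it can be printed as JSON. This is a sketch under assumptions: the bucket name and account ID are placeholders, and it grants only `s3:PutObject` to the production account (no read, list, or delete).

```python
import json

def write_only_bucket_policy(bucket: str, writer_account_id: str) -> dict:
    """Bucket policy letting another account add objects but not read or delete them."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowBackupWritesOnly",
                "Effect": "Allow",
                # Placeholder principal: the production account that sends backups.
                "Principal": {"AWS": f"arn:aws:iam::{writer_account_id}:root"},
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }

policy = write_only_bucket_policy("example-backup-bucket", "111122223333")
print(json.dumps(policy, indent=2))
```

Note that a put to an existing key replaces the object in an unversioned bucket, so "no delete" on its own does not stop overwrites; versioning on the backup bucket would be needed alongside a policy like this.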

tonto 4y ago

I had this experience when I asked about S3 backup also (after a junior programmer deleted a directory in our S3 bucket...). The response from r/aws was "just don't let that happen" (or "use IAM roles").

AceyMan 4y ago

411: at the latest re:Invent, AWS announced a preview of AWS Backup for S3 (right now in USW2 only).

Relevant blog post, https://aws.amazon.com/blogs/aws/preview-aws-backup-adds-sup...

sidpatil 4y ago

> I asked what if we delete data from app by mistake? They told me we need to be careful not to do that.

Ah, the Vinnie Boombatz treatment.

smiths1999 4y ago

Maybe they are getting tired of arrogant older programmers assuming they cannot possibly be wrong. God forbid a 25 year old might actually have a good idea (and I am far removed from my 20s).

Maybe having S3 redundancy wasn't the most important thing to be tackled? Does your company really need that complexity? Are you so big and such an important service that you cannot possibly risk going down or losing data?

ramraj07 4y ago

You really chose to die on “backups are for old people” as a hill?

exdsq 4y ago

I’d love to know what someone works on when the risk of losing data is not worth one or two days engineering work.

lostcolony 4y ago

But that's just it; you can't even have that discussion if the response to "hey, should we be backing up beyond S3 redundancy?" is "No. Why would we? S3 is infallible"

FpUser 4y ago

>"Maybe they are getting tired of arrogant older programmers..."

And this is of course valid reason to ignore basic data preservation approaches.

Myself I am an old fart and I realize that I am too independent / cautious. But I see way too many young programmers who just read the sales pitch and honestly believe that once data is on Amazon/Azure/Google it is automatically safe, their apps are automatically scalable, etc. etc.

javagram 4y ago

Sounds like the kind of short-term thinking that leads to companies being completely wiped out by ransomware. Who needs backups anyway?

oblio 4y ago

It's not a lot of complexity.

Add object versioning for your bucket (1 click) and mirror/sync your bucket to another bucket (a few more clicks).

Yes, your S3 costs will double, but usually they're peanuts compared to all the other costs, anyway.

Debating it takes longer than configuring it.
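
For anyone who prefers code to clicks, the versioning half might be scripted roughly like this. It is a sketch: `put_bucket_versioning` is the boto3 S3 client call, but the bucket name is a placeholder and the `RecordingClient` stand-in exists only so the example runs without AWS credentials. The mirror/sync half (replication to a second bucket) is not shown.

```python
def enable_versioning(s3_client, bucket: str) -> None:
    """Turn on object versioning so overwrites and deletes keep prior versions."""
    s3_client.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={"Status": "Enabled"},
    )

class RecordingClient:
    """Stand-in for a boto3 S3 client; it only records the calls it receives."""
    def __init__(self):
        self.calls = []

    def put_bucket_versioning(self, **kwargs):
        self.calls.append(kwargs)

client = RecordingClient()
enable_versioning(client, "example-bucket")
print(client.calls)
```

With real credentials, the object passed in would be `boto3.client("s3")` instead of the stand-in.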

wly_cdgr 4y ago

Would you put the one and only copy of your family photo album up on AWS, where AWS going down would mean losing it? Because your customers' data is more important than that.

Beltiras 4y ago

Losing data usually means losing customers. Usually more customers than just those whose data you lost.

silon42 4y ago

Next time also mention that it might be a problem to get a consistent backup of microservices...

hinkley 4y ago

AWS has had at least one documented incident where a region had an S3 failure that was not recoverable. They lost about 2% of all data. That might not sound like much but if you have a lot of data, partial restoration of that data doesn't necessarily leave your system in a functional state. If it loses my compiled CSS files I might be able to redeploy my app to fix it. Then again if I'm a SaaS company and that file was generated in part from user input, it might be more difficult to reconstruct that data.

Johnny555 4y ago

Which incident is this? I can’t find it online. The closest I can recall is when they lost some number of EBS volumes. We were affected by that, but ran snapshots (to s3) to recover the affected servers.

newobj 4y ago

Sorry, when was this? Please provide a citation.
