
macintux · 3y ago

Or, the redundancy actually causes a failure, so not only have you spent more money but you’ve reduced your availability doing so.

(Or worse, the redundancy causes a subtle failure like data loss.)


martinald · 3y ago

Nail on the head. The number of times I've seen way overcomplicated redundancy setups fail in weird and wonderful ways, causing way more downtime than a simpler setup would have, is pretty silly.

bad416f1f5a2 · 3y ago

Don’t make the mistake of overromanticizing the simple solutions. They have nice, well-understood failure conditions, but those failures come up relatively frequently.

When you start playing the HA game, the easy failures go off the table, and things break less often because “failures happen constantly and are auto-healed”. But when your virtual IP failover goes sideways or your cluster scheduler starts reaping systems because the metadata service is giving it useless data, you’re well into an infrequent, complex failure, and I hope you have a good ops team.

It’s always a trade-off.
