It seems reasonable to start worrying about the fragility potentially introduced by these massive internet infrastructure companies.
The US has no terrorism problem. In 1979, the IRA killed the Queen's uncle-in-law, and in 1984, they blew up a hotel where Thatcher was staying. That's a serious terrorism problem! What the US has is nothing by comparison. We were unlucky on 9/11, and ever since we've been distorting our foreign policy out of unreasonable fear.
There are a LOT of far softer targets that go unprotected. A terrorist attack on a sewage plant for a major city would be far more devastating than knocking out a few websites.
If you wanted that type of destruction, and wanted it to be noticed, you would need a city-leveler type of event in a data-center-heavy area.
That should be the priority.
Realistically, this has not been the solution implemented (in the EU & US, at least). In the EU, it is even more crucial, as the "solutions" to this problem are applied to state finances as well as financial institutions.
In terms of policies, there are two competing approaches: (1) Reduce the size of "too-big-to-fail" institutions. (2) Regulate them more heavily (or some other strategy) so that they will not fail. In the EU, this is being applied to states, not just financial institutions. Rules that (supposedly) reduce catastrophic risk.
Almost all serious policy proposals are in the no. 2 category. Tighten regulation, reduce the risk of failure. Tighter regulation leads to stronger incumbents and larger average company size, so by doing 2, you are probably doing the opposite of 1.
As I said, I don't know what Bernie's proposal is or how mature it is as a policy (as opposed to a politician's statement). It would be notable if a left-wing politician proposed loosening bank regulations, though definitely not impossible or unreasonable.
That seems like the most reasonable response. And yet, since the Great Recession, our policy has been "make 'too big to fail' even bigger".
The problem is that the banks have become too powerful for anyone to challenge. A Teddy Roosevelt type of political leader can't exist today.
I'd say that most industries do not have anyone responsible for worrying about it high enough in the management chain.
(Apart from the result of a botched patching or update to the core software stack that was done worldwide at the same time and hopefully never happens).
Also, deployments are designed to be exponential and no region should ever have a cross region dependency.
It looks very separated on the outside, but I've worked in so many companies that have appeared incredibly competent externally but have "snowflake" servers which keep things ticking over. Given Bezos's treatment of workers, I have absolutely no confidence that everything is as cleanly engineered as they claim.
[1]: https://news.ycombinator.com/item?id=12392081
[2]: https://minio.io/
I don't understand the preference for AWS over open source in many cases. Their services are "reliable", but they often have minute restrictions that will eventually bite you. You also end up having to pay for something you could get for free. Why use SNS/SQS when there are free pubsub/message buses out there? Most of the other devs justify this with the argument of not having to maintain the software themselves. "But RabbitMQ might crash! We don't have to worry about that with AWS!"
Anyway, I typically minimize the AWS services I use (S3, EC2, ECS) so I don't dread the day AWS blows up or, more likely, some VP or exec says we're moving to GCP/Azure because we got a better deal.
Free is never really free. There's always a tradeoff in engineering time and money when you choose to run your own stack instead of paying to use a stable, well-established service. Oftentimes running your own will be cheaper overall, but you have to do that cost-benefit comparison for yourself.
I can confirm that not only can RabbitMQ get into an unusable state, it will do so extremely rapidly and with little warning unless you sit an engineer or two on it to monitor and manage the incoming/dead letter rates.
- S3 has a public protocol and many 3rd party providers support it (OpenIO, Scality, Ceph, Minio, etc),
- EFS could be replaced with something like DRBD or GlusterFS, or DigitalOcean's block storage or Google Cloud's networked disks.
- ELB could be replaced easily with similar services from other providers [1] if you use Kubernetes (I don't know if all have a LoadBalancer type though)
I would be more concerned about firewall/vpc rules, because I have no idea how those could be migrated without risk of forgetting some. Lock-in seems not that high in the end though and even less so if you use an open source container orchestration stack because they abstract most of these things away.
[1] https://kubernetes.io/docs/tasks/access-application-cluster/...
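To make the "lock-in is not that high" point concrete, here is a rough sketch of keeping application code behind a small storage interface so the backend can be swapped later. Everything in it is hypothetical and made up for illustration (`ObjectStore`, `InMemoryStore`, `LocalDirStore`, `save_report`); a real S3-compatible backend would be one more class with the same two methods, and this sketch does not talk to any actual provider.

```python
import tempfile
from pathlib import Path
from typing import Protocol


class ObjectStore(Protocol):
    """Hypothetical minimal interface the application codes against."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class InMemoryStore:
    """Stand-in backend that keeps objects in a dict (useful for tests)."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


class LocalDirStore:
    """Stand-in backend that maps keys to files under a root directory.

    Assumption: keys are trusted relative paths; a real backend would
    validate them.
    """

    def __init__(self, root: Path) -> None:
        self._root = root

    def put(self, key: str, data: bytes) -> None:
        path = self._root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key: str) -> bytes:
        return (self._root / key).read_bytes()


def save_report(store: ObjectStore, name: str, body: str) -> str:
    """Application code: it only knows about put/get, not the provider."""
    key = f"reports/{name}.txt"
    store.put(key, body.encode("utf-8"))
    return key


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for store in (InMemoryStore(), LocalDirStore(Path(tmp))):
            key = save_report(store, "q1", "all good")
            print(type(store).__name__, store.get(key))
```

The migration cost then sits mostly in writing one new backend class and moving the data, not in touching every call site.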
P2P networks, each computer being a "data store" on the internet, no one entity can control data, etc to modern day centralized cloud where a couple of players control so much.
There has been a cultural shift. In the early 2000s, the idea of storing your data somewhere else would have been weird. But now, people don't care about keeping their data in Apple's/Google's/etc. data centers.
I think it has to do with the fact that computer/internet illiterate people are now the majority whereas in the 90s/early 2000s, it was generally the computer literate on the internet.
I think the reasoning was that cloud accounts are easier for the masses than mapping a drive and accessing it over VPN.
If you can't build/run a better AWS replacement then it's a mute point, isn't it?
Then the question turns into: if you can't build a better AWS, can you architect your application to handle AWS failures? AWS itself lets you handle many kinds of failures at the AZ/DC level. Are you using that? For global AWS outages, can you have a skeleton, survival-critical system running on GCP or Azure?
Have you thought about outages that would be out of your control and out of AWS's control e.g. malware, DDoS, DNS, ISP, Windows/Android/iOS/Chrome/Edge zero day? How are you going to handle outages due to those issues?
If you are prepared to handle outages (communication, self-preservation, degraded mode, offline mode) then can a serious AWS outage be managed just like those outages?
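A minimal sketch of what "degraded mode" could look like in code, assuming each backend is just a callable that either returns a response or raises. The names (`call_with_failover`, `fetch_from_primary`, `fetch_from_skeleton`, `serve_static_status`) are invented for the example and do not correspond to any AWS, GCP or Azure API.

```python
from typing import Callable, Sequence


class AllBackendsFailed(Exception):
    """Raised when every backend in the list has failed."""


def call_with_failover(
    backends: Sequence[tuple[str, Callable[[], str]]],
) -> tuple[str, str]:
    """Try each backend in priority order; return (backend name, response)."""
    errors = []
    for name, fetch in backends:
        try:
            return name, fetch()
        except Exception as exc:  # deliberately broad: any failure means move on
            errors.append(f"{name}: {exc}")
    raise AllBackendsFailed("; ".join(errors))


# Hypothetical backends, ordered from full service to bare survival mode.
def fetch_from_primary() -> str:
    raise ConnectionError("primary region unavailable")  # simulated outage


def fetch_from_skeleton() -> str:
    return "read-only catalogue served from the second provider"


def serve_static_status() -> str:
    return "static status page: we know, we are working on it"


if __name__ == "__main__":
    served_by, response = call_with_failover(
        [
            ("primary", fetch_from_primary),
            ("skeleton", fetch_from_skeleton),
            ("static", serve_static_status),
        ]
    )
    print(served_by, "->", response)
```

The point is only the ordering: full service first, then the skeleton system, then plain communication. It works the same whether the cause is an AWS outage, a DNS problem or a DDoS.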
It's like a cow's opinion, you know, it just doesn't matter. It's "moo".
Thanks. :)
Which I think is a merit of using VMs as opposed to individual services.
You can do that easily if you just treat clouds merely as hosted hypervisors and think entirely in terms of VMDKs. But this doesn't make commercial sense to do at least in the short term - you need to utilise the layered services you are paying for anyway or you might as well just run your own DC.
We have contingency against this via our own infrastructure but I worry about organisations who don't have any.
Source: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-reg...
Some of the traditional apps we host are vulnerable to hypervisor failure, be that at the rack, DC or region level.
Then some businesses will be out for a few hours / days.
No big deal.
From WWII to 9/11 to Katrina (and whatever regional stuff we have), we have been through much worse than that in modern history.
The author states that a "snowball" is a grey suitcase with 50 TB of HDD space inside, and a "snowmobile" is a massive 18-wheeler with what I would assume is petabytes of storage.
> Not the lumps of mush and ice that children chuck at each other, but Amazon’s portable information storage devices, big grey suitcases that hold huge amounts of data.
Capitalizing it might have helped, though.
https://azure.microsoft.com/en-gb/overview/azure-stack/
There might well be a commercial niche for providing Azure Stack hosting in non-Microsoft data centers.
Personally, I think MS crapped the bed a little by taking Azure Stack off of commodity hardware and onto a combined hardware/software solution. Being able to deploy Azure-compatible solutions piecemeal locally would be a massive boon to governments, healthcare operations, and anyone working on a more thorough migration to the cloud.
Most of the EU, for example, has privacy regulation that makes cloud hosting impossible in some situations. Having a 'local Azure' would make it highly reasonable to have all apps architected around Azure's components and technology. Without the local deployment though you're kinda stuck with each foot in a different canoe... Hybrid infrastructures are highly favorable to DevOps and multi-party development scenarios.
Basically, DR testing failed every six months and this was accepted as harsh reality. After seeing how they work on the inside, I don't think that moving their infrastructure to AWS/Azure/Google is the worst that could happen.
disc: Currently working at Amazon, but not at AWS.
How do people who need to have more nines of availability manage this issue with cloud providers? (EC2 and RDS promise 3.5 nines per AZ, but I imagine outages are somewhat correlated across zones)
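For a sense of scale, here is some back-of-the-envelope arithmetic, assuming "3.5 nines" means 99.95% and, unrealistically, that zone failures are fully independent. The function names are made up for this sketch, and correlated outages will make the real combined figure worse than what it computes.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_minutes_per_year(availability: float) -> float:
    """Expected downtime per (non-leap) year for a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def combined_availability(availability: float, zones: int) -> float:
    """Availability of N redundant zones, assuming independent failures.

    The service is down only when every zone is down at the same time,
    so the unavailabilities multiply.
    """
    return 1.0 - (1.0 - availability) ** zones


if __name__ == "__main__":
    single = 0.9995  # assumption: "3.5 nines" per zone
    for zones in (1, 2, 3):
        combined = combined_availability(single, zones)
        minutes = downtime_minutes_per_year(combined)
        print(f"{zones} zone(s): {combined:.10f} -> {minutes:.4f} min/year")
```

One zone at 99.95% works out to roughly 263 minutes (about 4.4 hours) of downtime a year. Two truly independent zones would cut that to a few seconds, which is exactly why the correlation question matters: the gap between the theoretical and the observed number is the correlated part.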
If you do go multi-cloud, I would be wary of picking regions that are located very close to each other. While you'll obviously get independent code and (likely) independent deployments, you're still susceptible to issues correlated with the physical location.
[0] https://medium.com/netflix-techblog/global-cloud-active-acti...
Users are patient enough to give you a pass if you're down that amount (especially if you're down that amount while 1/3rd of the internet is also down).
Our largest e-commerce retail site does over $1BB/yr in fairly high-margin sales and still targets "only" 99.95% availability (generally it exceeds that with actual results, but we don't target higher than that). It's a hybrid of on-prem and cloud services backing that, migrating towards the cloud, but will never be 100% cloud as we own and run factories with on-prem equipment.
(I know you asked "how" and I answered "whether", but I thought it relevant.)
Rabbit?
Plus, some people have huge, huge datasets. It could easily take weeks to migrate to, say, GCE, or to your own hosted servers. In the latter case, it would also necessitate a pretty large up-front investment.
The connections that could cause problems may not be obvious. For example, a network provider could run into trouble because a ticketing or monitoring system that depends on Amazon does not work. Or a hardware supplier might not be able to ship spare parts for your on-premise SAN because the logistics company runs into trouble due to issues at Amazon.
Their support is alright, although you often have to pay for it. But the AWS docs are atrocious: they remind me of university textbooks written by professors who like creating pseudo-scientific-sounding jargon. Mixed with the huge array of features, that makes AWS quite uncomfortable to use, even for people with intermediate experience (the built-some-apps-with-AWS-before kind of people).
I can see that there could be more specialized services like Firebase (which is built on Google Cloud) that should be built on AWS for the users. Firebase is a breeze to use and very responsive and I've used it to build real-time chat apps in a couple days.
Think Cloned bananas vs fingers disease but computers. http://www.bbc.com/news/uk-england-35131751
The very nature of AWS requires Amazon to build in capabilities to handle failover. But, as they say at Amazon, "everything fails, always".