What's needed to combat DDoS attacks is distributed defense. Without their own backbone / private transport links between all of their locations, their network is just a disparate set of data centres and there is no advantage to their having multiple locations, so far as protection from DDoS attacks is concerned.
They also fail to mention what the capacity of each link is. They could be anywhere from 1Gbps to 100Gbps, but I presume they'd mention as a selling point anything 40Gbps and up, so let's assume they're using all 10Gbps links and not 1Gbps to give them the benefit of the doubt. So, they range from 50Gbps (Singapore) to 100Gbps (London) per location.
It's an impressive list to look at in aggregate, but not really that much for any one location in 2016, especially given a company of their size and visibility, when you can rent shared access to a 200Gbps+ botnet for $19.99. https://www.nanog.org/sites/default/files/20161015_Winward_T...
Instead of buying transit from up to 7 carriers per location, when there are starkly diminishing returns after 3 or 4 so far as routing performance is concerned, they should have instead bought higher capacity to each provider (to ensure at least 10Gbps of unused capacity per provider outside of regular legitimate traffic), external DDoS mitigation, or domestic backbone links and turned up more capacity at the LA Any2 (for Asia) and NYIIX (for Europe) to absorb the majority of DDoS traffic which comes from those regions. With up to 7 carriers, they simply have 7x different points of failure each at only 10Gbps, while getting worse deals on transit pricing due to lower volumes with each provider.
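To put rough numbers on the concentration argument, here is a minimal Python sketch. The 70 Gbps total, the 30% legitimate utilization, and the helper name `attack_to_saturate_one_link` are all illustrative assumptions, not Linode's actual figures. It also assumes capacity and legitimate traffic are split evenly across providers, which real BGP path selection will not do exactly.

```python
def attack_to_saturate_one_link(total_gbps, providers, utilization):
    """Gbps of attack traffic needed to fill a single provider's link.

    Assumes (hypothetically) that capacity and legitimate traffic are
    split evenly across all providers.
    """
    per_link = total_gbps / providers   # capacity of each transit link
    legit = per_link * utilization      # legitimate traffic on that link
    return per_link - legit             # remaining headroom on that link


# Hypothetical numbers: same 70 Gbps total, 30% legitimate utilization.
spread = attack_to_saturate_one_link(70, providers=7, utilization=0.3)
concentrated = attack_to_saturate_one_link(70, providers=2, utilization=0.3)

print(f"7 x 10 Gbps links: {spread:.1f} Gbps fills one link")
print(f"2 x 35 Gbps links: {concentrated:.1f} Gbps fills one link")
```

Under these assumptions the total capacity is identical in both cases; only the headroom on any single link changes.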
Linode is moving 200-300 Gbps globally. That is about 37.5 Gbps per location, and when you figure in a 20% utilization (because you need to be able to burst)... they have about 300 Gbps of transit per location. Spread across 3-5 carriers I would guess they have 40-100 Gbps from each. Way more than your estimated 10 Gbps.
As far as "routing performance" goes, they appear to be buying from a few Tier 1 networks per location, and a mix of regional Tier 2s. That is in line with best practices. Sometimes to reach the right networks you do need to spin up circuits with multiple Tier 2s; there is no such thing as "diminishing returns" if you are doing traffic engineering properly.
The right way to build networks is to meet your performance needs first and foremost, have enough headroom to grow and serve your customers, and work with your upstreams to manage incoming attacks. An external scrubbing service makes no sense when you can adapt your network as Linode has done so they can easily blackhole targets at their upstreams' edge.
I applaud their efforts. This is some smart network engineering.
We're actually moving to 5-10x that figure per location. We aren't playing around!
I'm quite skeptical of the numbers you're citing though. Their PeeringDB profile suggests they only have 10-20Gbps of peering per city. Considering their profile was updated just a few days ago, I would be inclined to consider those listed capacities accurate. Although they mention 'hundreds of Gbps' of capacity per city, they also mention sending up to 50% of traffic through peering in London, where they only have 50Gbps of total peering capacity. Perhaps you're right on that 300Gbps of capacity per location, in which case they would run much lower utilization rates on transits than peers. But that would be even worse allocation of spending than in my initial assessment, considering the cost of a port at an exchange is much cheaper than a transit link with CDR. It also leaves them highly vulnerable to DDoS attacks through exchanges.
For a content network, public peering at exchanges in North America just with route servers and networks that have open policies would result in 30-40% of traffic going through the exchange. They would easily do more traffic at the exchange than any one transit in a mix of 3-4, let alone in a mix of 5-7.
With any significant private peering, easily 60% of traffic could be settlement free. And guess what, most significant peers require peering at multiple locations with a full set of prefixes, which requires you have a backbone. With the traffic levels you're suggesting they run, they should be able to negotiate settlement free peering with many major regional Tier 2's, making it even less sensible to be purchasing from multiple ones. Considering most Tier 2's within a given region will all peer with each other, there are very few improvements to be had by turning up additional ones. Where there's the most room for improvement is being on the right long haul fiber paths. In which case, given their North American focus, they should be buying from Level3 and they probably could at comparable rates to their current agreements by concentrating more of their commits at fewer providers. If they had their own transport, they could also determine which fiber paths they take across their backbone to ensure optimal latency. Beyond that, given that Tier 1's all peer with each other by definition, it's just a matter of dumping traffic out any one of them for local traffic without traversing a congested peering link. The microseconds it takes to go an extra AS hop within a city has indistinguishable impact on performance.
I'm not sure what best practices you're referring to. Who else can you name that utilizes up to 7 transit providers in a given city, without operating their own backbone? The only one that I can personally think of is Internap, when they abandoned building their own backbone halfway through turning it up. Ask their former network engineers, from their golden years when they had their highest market share, how that worked for their network and for their business.
Are you a current Linode customer in one or more locations? If you were, you'd probably have experienced packet loss issues on a regular basis due to DDoS attacks. There's a reason why they're performing these network upgrades; they've had near daily network interruptions due to DDoS attacks since Christmas of last year, with some outages lasting up to almost a day. Smart network engineering would have never let their network become that unreliable in the first place. And if you were going to blame a lack of budget for that, my suggestions would be even more appropriate for them as they would allow them to scale their network in a much more cost effective way. An external scrubbing service makes sense, when they've been ineffective at mitigating attacks to date. Your network is only as resilient towards DDoS attacks as your weakest links, and spreading out capacity to a larger number of providers instead of concentrating higher capacities with fewer ones makes it much easier to saturate connectivity to one of them.
The only way Linode's current network strategy makes sense, assuming that it's not due to technical oversight, is if it's marketing driven. That's a fair reason, but it should also be fair to call them out on it. I'm not sure why you feel that strategy is in any way optimal, when it's the opposite of the models of hosting companies most renowned for their networks. Take for example SoftLayer, who went to great lengths to build out their own backbone fairly early on. I may halfheartedly agree that Linode's network upgrade strategy might be smart marketing, but I would wholeheartedly disagree that it's smart network engineering. It's not cost effective, is sub-optimal for resiliency against attacks, and fails to leverage peering effectively.
I was responding to your statement that you needed a backbone to be able to deal with DDoS attacks. That is simply untrue. Most hosting providers announce separate space from each location and do not backhaul. If you do want to sink attacks closer to the source (which in itself only really makes sense if you have highly diverse POPs) you can GRE the clean traffic between sites.
Your PeeringDB profile indicates you push 10 Gbps at peak. As this is getting quite long in the tooth for a HN thread, email me next time you are in the bay area. I'd happily share some operational tips for running high volume and attack sinking networks that really require a whiteboard. Heck, maybe we can get someone from Linode to join us too for beers. :)
> they've had near daily network interruptions due to DDoS attacks since Christmas of last year, with some outages lasting up to almost a day
Whoa, nope[1].
> It also leaves them highly vulnerable to DDoS attacks through exchanges.
We aggressively de-peer with networks that regularly originate attack traffic, allowing us to size our ports according to utilization rather than worst case attack sizes. Multilateral peering is actually a bit more expensive than transit these days - much more expensive if you intend to significantly overprovision capacity.
> Who else can you name that utilizes up to 7 transit providers in a given city
That's fair, but missing context. We're figuring out who works best for our traffic profile. We will scale back/remove the underperformers and scale up those that prove their worth.
> It's not cost effective, is sub-optimal for resiliency against attacks, and fails to leverage peering effectively.
All of this is dramatically incorrect.
I am looking forward to that! Linode is my go-to hosting service, but it's a little troubling that anyone in the datacenter can hit your private IPs [1]. On the other hand, maybe it shouldn't matter, and you should always act like the network is compromised. Isn't trusting their private network how Google leaked traffic to the NSA? Still, it seems like a nice improvement that would make compromises less likely.
[1] https://blog.linode.com/2008/03/14/private-back-end-network-...
What about cases like AWS's VPCs?
Everything at Google Cloud is encrypted at rest and in transit [0]. Any GCE project is essentially a VPC by default, and a global one at that [1] (aka no need to VPN between regions). Traffic between GCE zones/regions never hits public wire by default, and Google will carry your packet to the nearest Google POP around the world on its private backbone [2].
(work at Google Cloud, but not on networking/GCE)
[0] https://cloud.google.com/security/encryption-at-rest/
Edit. Never mind I see the link.
Granted, not as big of a deal as power plants etc., but if you're looking for soft targets, it's a scary thought.
Most datacenters have fences/gates, require access cards and/or biometrics to get in and move around inside the building. Once inside, you can only get into your own cages.
It's not like you can walk up, knock the door in with a battering ram, and then have access to everything inside.
Well, I mean, yes you can. The actual doors/gates used aren't 'milspec' intrusion rated. They're `better-than-home-depot` doors (all steel, steel door frames, reinforced). Cage doors are often hilariously flimsy (thin metal sheet/bars).
Certainly breachable by even modestly equipped attackers.
The reason you pick real DCs and not the basement of your fortified house is that there's human security in addition to the physical security measures. Which means someone will notice if you try to bust down the door.
Imagine a large meteorite impact :).
https://jsonip.com is hosted with Linode and supports millions of requests a day. It's been a great home for the service.
I can't speak for them but I worked at what boils down to a semi competitor a decade ago doing network stuff. MPLS is old stuff now and you can google the specific cisco model numbers and MPLS if you'd like to read configuration guides.
Superficially only having 4096 VLAN labels on an ethernet connection appears to be a big problem if you have more than 4096 customers. However MPLS label space is 20 bits so you're good to a million customers.
Then you have some "fun" mapping games such that your router connects traffic on MPLS label 123456 (which is your customer number) to local ethernet interface port wtf on vlan 100 or whatever you have been given.
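As a minimal sketch of the numbers and the mapping described above (Python; the table contents, the port name `port-a`, and the `lookup` helper are hypothetical and only illustrate the idea, not any vendor's configuration):

```python
VLAN_ID_BITS = 12     # width of the 802.1Q VLAN ID field
MPLS_LABEL_BITS = 20  # width of the MPLS label field

print(2 ** VLAN_ID_BITS)     # 4096 VLAN IDs per ethernet link
print(2 ** MPLS_LABEL_BITS)  # 1048576 MPLS labels

# Hypothetical mapping table: MPLS label (here, the customer number)
# -> (local ethernet port, VLAN ID used on that port).
label_map = {
    123456: ("port-a", 100),
}


def lookup(label):
    """Return the (port, vlan) pair that a given MPLS label maps to."""
    if not 0 <= label < 2 ** MPLS_LABEL_BITS:
        raise ValueError("label does not fit in 20 bits")
    return label_map[label]


print(lookup(123456))  # ('port-a', 100)
```

Because the VLAN ID only has to be unique per local port, the 4096 limit applies per link rather than across the whole customer base. A few values at the edges of both ranges are reserved in practice, so the usable counts are slightly lower.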
At least that would have been cutting edge a decade ago and probably still is today.
It's unlikely to be any more, or any less, secure than anything else in a virtualized cloudy environment.
802.1ad, a.k.a. "Q-in-Q" [0]
> However MPLS label space is 20 bits so you're good to a million customers.
VXLAN, cf. RFC 7348 [1], is the latest coolness, allowing for up to 16M "virtual" networks (using 24 bits) and bridging layer 2 over IP (4789/UDP).
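For illustration, a small Python sketch of where those 24 bits live: it packs and unpacks the 8-byte VXLAN header as laid out in RFC 7348 (8 flag bits, 24 reserved bits, 24-bit VNI, 8 reserved bits). The function names are made up for this example, and it only builds the header, not the surrounding UDP/IP encapsulation.

```python
import struct

VXLAN_UDP_PORT = 4789   # destination UDP port for VXLAN
VNI_BITS = 24           # width of the VXLAN Network Identifier
FLAG_VNI_VALID = 0x08   # the "I" flag: VNI field is valid


def pack_vxlan_header(vni):
    """Build the 8-byte VXLAN header for a given VNI."""
    if not 0 <= vni < 2 ** VNI_BITS:
        raise ValueError("VNI does not fit in 24 bits")
    # Word 1: flags in the top 8 bits, 24 reserved bits left as zero.
    # Word 2: VNI in the top 24 bits, 8 reserved bits left as zero.
    return struct.pack("!II", FLAG_VNI_VALID << 24, vni << 8)


def unpack_vni(header):
    """Extract the VNI from an 8-byte VXLAN header."""
    _, word2 = struct.unpack("!II", header)
    return word2 >> 8


hdr = pack_vxlan_header(123456)
print(len(hdr), unpack_vni(hdr), 2 ** VNI_BITS)  # 8 123456 16777216
```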
I’m looking to replace one of my VPSes at DigitalOcean because of stability issues (I need to reboot the VM every couple of months; it apparently just drops off the network entirely).
Linode seems like a good alternative. My criteria for this application are ≥ 1G of RAM, SSD storage, fast RTT to my other VPS, native IPv6 support.