One aspect I intended to cover (and will now do so here) is that of cost. I get very frustrated by cash-strapped startups which from day one are expecting to be 'web-scale' and need to turn up machines at the drop of a hat. Let's get real. I HOPE you have to worry about that... much, much later.
I'm quite pleased you started your article mentioning solutions like Heroku. Unless you have some special needs not met by a PaaS, this is where you should start. You should be writing code, not managing servers (this coming from an operations guy who has been managing servers for 12 years, and worked at a webhost). Once you scale far enough that it's worth hiring someone with the knowledge to maintain your application stack, OS updates, security, etc. -- THEN move on. Not a moment before.
Cloud servers are realistically not the best price/performance/low-maintenance solution for MOST startups. You should get a VPS (Linode) or dedicated server (from a reputable company which can offer quick SLAs on replacing parts - like 2 hours at voxel.net). Dedicated servers are cheaper than you think. I pay Voxel $180/mo for an 8GB quad core 1TB box on 100mbit/sec. It outperforms servers costing twice as much in EC2 - and that's not counting bandwidth or storage. Concerned about reliability? Buy TWO, in different datacenters. You're STILL saving money, and you have the exact same level of maintenance overhead as AWS (OS, updates, full application stack), while reaping the performance benefits of bare metal.
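To put rough numbers on that comparison, here is a back-of-the-envelope sketch. The $180/mo figure comes from the comment above; the EC2 figure is just "twice as much" taken literally, and the bandwidth/storage charge is a made-up placeholder, not a real AWS price. With that placeholder set to zero the two options come out equal, so the "still saving money" claim rests entirely on the metered extras.

```python
# Rough monthly cost comparison: two dedicated boxes vs. one "comparable" EC2 setup.
DEDICATED_MONTHLY = 180.0                        # from the comment above
EC2_COMPARABLE_MONTHLY = 2 * DEDICATED_MONTHLY   # "twice as much", taken literally
EC2_EXTRAS_MONTHLY = 50.0                        # HYPOTHETICAL bandwidth + storage charge


def monthly_cost(per_box: float, boxes: int, extras: float = 0.0) -> float:
    """Total monthly bill for `boxes` machines plus any metered extras."""
    return per_box * boxes + extras


two_dedicated = monthly_cost(DEDICATED_MONTHLY, boxes=2)
one_ec2 = monthly_cost(EC2_COMPARABLE_MONTHLY, boxes=1, extras=EC2_EXTRAS_MONTHLY)

print(f"two dedicated boxes: ${two_dedicated:.2f}/mo")
print(f"one EC2 equivalent:  ${one_ec2:.2f}/mo")
print(f"difference:          ${one_ec2 - two_dedicated:.2f}/mo")
```

Swap in your own quotes for the three constants; the point is only that the redundancy argument is easy to check before committing either way.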
You do NOT want the headaches of colocation. You cannot pay your staff enough to stay local to the server 24/7, and the cost of keeping spare parts on hand eats up whatever you would save over dedicated.
Your startup is not Google, so I won't get into having your own datacenter. (Well done pointing out that they're not getting advice from your blog.)
EC2 may be expensive, but if anything goes wrong, I can boot another instance in seconds and abandon the old one. They fix it on their own time.
Sure, I could buy more servers so that one going down doesn't affect me, but that's twice as expensive. On EC2 I don't have any penalty for abandoning an instance for any reason. I've actually moved data centers in EC2 when the performance profiles were better on the other side for the same size instances. It was a temporary difference, but migration was pretty much painless.
Your Voxel suggestion may have a 2 hour SLA for replacing parts, but what happens when they don't know what is wrong?
That said, I agree that a person should try for the PaaS and evaluate all the options fairly. I tried Linode, several dedicated hosts, and then finally moved to EC2 on a previous project.
We've been using Softlayer for years with ~100 servers and have never had this problem. No host is perfect but overall it's been a good experience and a lot cheaper than EC2.
The nice thing about going with a larger dedicated host like Voxel is that they have extra hardware & servers on hand.
In fact, they even have instant-provisioning of dedicated hardware: http://voxel.net/voxservers
I have worked for, and used, dozens of dedicated hosts over the years (I spend far too much time on http://www.webhostingtalk.com ), and can totally agree that these are common issues with dedicated servers.
To each their own; just sharing my $.02 perspective on the situation ;)
I run my staging boxes on EC2 and production on Linode. The micro instances are good for staging/dev at just $15/month. No other provider I know gives you a VM at such a low cost.
A lot of programmer overhead.
Compared to the amount of knowledge necessary to effectively and reliably run a dedicated server, it's honestly pretty trivial.
Or is it more about how it is all handled in the background?
With a VPS you typically get bandwidth, storage, etc. included - and it's 'uncomplicated' - you pay one monthly fee, and it covers it all.
AWS can be viewed as both fault-tolerant and more fault-prone, because of the extra complexity they've built in (they've gone down for substantial periods due to hiccups in this additional complexity).
A good VPS provider will make the particular system you're on fault-tolerant on its own: dual power supplies, RAID arrays, etc.
AWS doesn't care about making any particular machine fault-tolerant, because their model is that you should spin up a new instance, and throw away the one that failed.
That's great for a 'web-scale' enterprise that has invested the resources to make their site operational under that mindset. For early-stage startups, where dealing with a failed instance means their site is down until they create or restart a new one, the advantage goes to VPS IMHO.
colocation was expensive and the hardware problems were all mine. i was pretty much tied to my local datacenter because i didn't want to ship a server around (which would be at least a day of downtime). pricing can be hard to compare because of power/space/bandwidth. if the equipment i colocated didn't have IPMI support, it could sometimes take up to a half hour to have a datacenter tech be able to put a remote console online when there were problems. at the end of it, i had a bunch of servers that were worthless on the resale market due to their age.
VPSes were never a serious option for the reasons stated in this article. it's impossible to track down performance problems when a dozen other VPS customers on the same server are taxing the CPUs and disks. i do use one that i pay $10/month for just to run a network monitor for some off-network perspective. they can be useful for single-task servers that don't need a lot of processing power like dns servers.
with dedicated servers, though, you can sign up on a website and within a few hours have a complete server with modern CPUs, disks, and lots of memory assembled, tested, and connected to the internet with a remote console waiting for an o/s installation. when hardware goes bad, the server provider has lots of spare parts waiting around to be swapped in for free. and the best part of all, when you're ready to upgrade or move to a different provider, you just cancel the account and let the provider worry about what to do with the old hardware. i have a handful of these on various providers costing between $140-$190 a month for something like a core i5 ~2ghz with 8gb of ram, 2 big sata drives, and 100mbit ethernet with more than enough transfer every month.
This seems ridiculously cheap compared to something with comparable RAM/CPU on EC2.
Is there a catch?
I pay about $3 per month for a very low-end xen VPS. Sure, I'm not running anything at all resource intensive on it.. but if I was maybe I'd spring for a higher-end VPS for $6, or if I really wanted to get crazy I'd find a really solid one for about $20 a month.
At $140-$190 a month, you'd better be getting a bunch of high-end dedicated servers and fantastic support, or you are getting seriously ripped off.
Hell, given the very modest needs of most small startups, having to shell out even $20 a month is a ripoff.
However, I remember that hosting in the US is surprisingly expensive for some reason, and has been for many years. Don't know why.
Either you use Linux all day every day for some time, or hire someone to consult with you. That's my opinion anyway from years of consulting with people :)
With who?
one thing to watch out for with dedicated hosting providers is that the networks are sometimes not so great. you may get a great deal on hardware and tons of bandwidth, but if it drops packets all the time, it's not worth it.
make sure they segment you off onto a vlan (see http://jcs.org/mitm for why), make sure they are well peered, make sure they actually have staff at or near the datacenter they're running things from, and make sure they have the ability to block certain traffic from reaching you if you need them to (see http://jcs.org/sip for why) so it doesn't count towards your bandwidth total.
If you go this route you can just choose to run virtualization yourself with VMware ESXi or KVM and libvirt.
You decide to run a site on 'shared' hosting (e.g. dreamhost, 1&1, godaddy (shudder), etc.) because it looks really cheap and they list so many "unlimited" things.
The good:
It's surprisingly pretty cozy for your startup wordpress blog about fish, although at that point you start wondering why you didn't just get a wordpress.com account instead, but you justify it by saying you were able to put your custom design theme with lasercats this way. You're not sure why your friends are saying the site feels slow.
The bad:
Oh, you like to program huh? Sorry our python version is 2 years behind. What's this ruby stuff you speak of? You're hosting user-submitted content on your bunk bed? Get out! Or upgrade to our overpriced "VPS" solution! You give up on the site and throw money away.
However the cheapest linode vps is $19.99/mo.
I would LOVE to have a vps to play around on, I would have so much use for it and yet I'm a student living in London and all my money goes towards the cost of living. It makes much more sense for me to go with the cheaper shared hosting as I just can't afford anything more.
As an aside, I found http://virpus.com/ the other day and they are selling vps's starting at $3/mo. Does anyone have any experience at all with them?
The real difference is not in price; it's that in shared hosting you don't get the option and associated responsibility of managing your software stack; in VPS you do.
They also give you some ability to "scale up" by simply asking for a bigger instance, and they handle moving your server over.
The only real downside is the default password. Unlike AWS, you are given a root password, which is pathetically easy to crack, so the first thing you need to do is change that.
Hell, my dad uses godaddy of all things, but at least he has his tiny wordpress site up to help his non-tech business.
Whorehouse.
It's really cheap ($20 for a VPS might be low for you, but it isn't for your nephew), doesn't matter that much when it's down, usually easy to set up, etc.
Even when it comes to performance, there's a world of difference between the average GoDaddy hosting plan and a medium-sized Linode.
Huffington Post, Gawker, BuzzFeed, CafeMom and AdMeld all co-lo with http://www.datagram.com/, most of them are within a few feet of each other.
Disclaimer: I am not affiliated with webhostingtalk -- just a user there.
Sorting out messy cabling or adding redundancy to a badly managed rack on live hardware is not fun.
[1] http://www.vibrant.com/cable-messes.php (for reference, this is what it should look like: http://royal.pingdom.com/2008/01/24/when-data-center-cabling...)
One option I have trouble fitting in, though, is the "run your own server locally". This might be #5, except it's often seen as actually a lower-class option than #4, rather than a step up: before you go all out with a colocated server, how about just a machine with Apache sitting in the office hooked up to your office's business-SDSL line?
I don't consider that an option for real hosting. There are a lot of reasons why it is bad.
Internet: DSL, or whatever your office has, is probably not that reliable, and single-homed (your ISP goes down, so do you).
Cooling: Offices are not designed to cool servers. The AC gets turned off at night and on weekends. Airflow is bad.
Power: Once you start running more than a few servers you will need to add special wall/roof mounted AC. The combined power of the servers and AC will cost you thousands per month.
Need: If you don't need more than a few servers, you don't need your own servers. It will cost less to rent a little space on someone else's server (VPS).
Hard: As the article says, "Hardware is hard." You have all the downsides of Condo and Manor, with none of the upsides. Power, cooling, out-of-band console, internet, networking, backups, provisioning, monitoring, the list goes on. And none of it will be to the quality of a datacenter. It's a lot of time and money for nothing.
"But ask not for whom the pager beeps — for sysadmin, it beeps for thee." I'm the sysadmin, and I don't want my pager going off because it's a three day weekend and the temperature in the server closet reached 120F (I've seen the temp reach 120F in about 30 minutes when the AC failed).
Now if you can't afford any downtime, a local server is probably not a good idea, but then neither are many of the cloud alternatives. Also depends on size, of course; one or two local servers is a more reasonable proposition than 35 servers randomly thrown under desks. (Though the "so uh, does anyone remember which room 'thor' is in?" moment used to be a classic startup rite of passage.)
Good: You (almost) have complete control over everything
Bad: Mom accidentally unplugged the power while vacuuming
Can anyone recommend a good guide to getting started with colo? Obvious questions include:
Where do you go to buy a cheap server? Can you just have it shipped direct to the data centre, or do you need to configure it yourself? How does that even work for startups in a different country to the data centre? Is there anything I should look for in a data centre to make it easier? Do any offer out of band consoles? What sort of costs are we talking about? Is there a break-even point beyond which you really should colo instead of using VPS? Is there a detailed tutorial anywhere on "getting your first coloed server up and running without bricking the stupid thing and needing to spend thousands of dollars getting a data centre technician to fix it"?
And so on. :) It seems like there are a LOT of resources to hold your hand as you get up to speed with Linode-type services, but colo is dark magic.
How were you burnt w/ the dedicated server? IME, those are guaranteed NO downtime, failed hardware is replaced at no charge and immediately, and a dedicated server failure takes top priority for the guys on the shift. Working in a fairly high pressure/low reward/24x7 environment is taxing, I sometimes miss it, but I like dev a lot more.
As far as guide to not bricking the thing: Our particular colo required DC power, which is what we configured our Dell with. Open up the box when we got it and -- aw, we can't turn it on. Luckily we had another AC power supply so we swapped that in (hurray modular!).
When you go to the colo, bring fuses, particularly if this is the first time you're wiring DC. Grab however many you think you'll need. Then grab some more because you'll futz it up.
Test your console (iLO, IPMI, etc) to make sure it's functioning before you leave. Most things at this point you can fix remotely.
As far as swapping hard drives and things go we decided to maintain that ourselves. RAID1 + Hot Spare on the OS to keep it running before you get over there to swap the drive.
But I can well believe it's better than many co-lo experiences, although all of mine have been positive. Right now I'm helping a friend with one: due to it housing Protected Health Information we pretty much have to run it on our own servers, which I built and he put into a co-lo that he's worked with before for this sort of thing.
Well luckily no backhoe but I did keep, at my own expense, Westell DS1 NIU's around because I had a situation where one went bad and it took the Verizon tech time to go and find one. So I bought a few so he would have the parts around. I also made sure when they strung the fiber from the connection point to our office (several hundred feet through other offices in the ceiling) that it was in orange conduit as opposed to just strung through the building (they were just going to run it like phone wire). I also had them bring an extra fiber through, as well as a fishing wire in case it was ever needed for anything in the future.
The OECD model tax convention, which has been adopted by many pairs of countries for their double taxation avoidance agreements (DTAAs), uses the term "permanent establishment" - "a fixed place of business through which the business of an enterprise is wholly or partly carried on". If a business derives income attributable to a fixed place of business in another country, they will likely need to deal with the tax department of that country. At best, this will mean considerable expenditure (by a bootstrapped startup's standards) on legal compliance, and at worst, could mean considerably more tax is paid.
For hosting stages 1-2 in the article, no fixed place can be attributed to the business purchasing the hosting (CPU resources are shared and no fixed CPU is assigned to the business). Disk space may temporarily be associated with a business, but this is under the control of the service provider - it is disk space and not physical disk sectors which are being rented, and the provider is free to move data.
However, stages 3-5 mean that a business has a very definite physical location (i.e. place of business) in another country.
These types of non-technical issues can often outweigh the technical ones in terms of business priority.
The definition is somewhat ambiguous, but it is likely that if there is an actual physical place (even if it is just a server) used to offer goods or services for sale to the public, there is a good chance that those sales are attributable to that server, and will be taxed in the country that server is located in.
This might not be a problem for an established corporation with the sales volume to justify the tax accountant expenditure and payment overhead needed to comply with tax law in multiple countries, but for a bootstrapped startup it can be an important consideration.
For example, a large majority of tech startups have a WordPress blog that is totally separate from their actual web application. In many cases it also drives the marketing front-end of their website.
So while my main application may be on Heroku or AWS, I like to fire up a free PHPFog (I don't work for them, you could use DotCloud, or likely another alternative as well) account, and have my WordPress install set up there. It's insanely easy to set up (it's all Git based), and the free account will get you a long way.
It's also nice to know that the WordPress install lives on an entirely different server, so if you get slammed with great press, your entire stack isn't feeling the heat. There are security fears here as well, so I like having it separate.
According to their website (http://success.heroku.com/) some pretty large websites run there, including Urban Dictionary and Rapportive.
Sure, it may cost more, but not more than a full time sysadmin and you are buying efficiency and flexibility. You can buy a lot at heroku for $10,000/month (the minimal cost of a full time deployment / sysadmin / dbadmin,) including I'd imagine some rather hands-on support.
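The break-even argument above can be sketched in a few lines. The $10,000/month staffing figure is the commenter's; the PaaS and self-hosted bills in the two examples are hypothetical numbers chosen only to show both outcomes.

```python
# When does a PaaS bill stay below "servers + someone to run them"?
SYSADMIN_MONTHLY = 10_000.0  # commenter's estimate for a full-time sysadmin


def cheaper_option(paas_monthly: float,
                   self_hosted_monthly: float,
                   staff_monthly: float = SYSADMIN_MONTHLY) -> str:
    """Return 'paas' or 'self-hosted', whichever has the lower total monthly cost."""
    self_run_total = self_hosted_monthly + staff_monthly
    return "paas" if paas_monthly < self_run_total else "self-hosted"


# HYPOTHETICAL bills for a small and a large deployment.
small = cheaper_option(paas_monthly=1_500.0, self_hosted_monthly=400.0)
large = cheaper_option(paas_monthly=15_000.0, self_hosted_monthly=2_000.0)

print("small deployment:", small)
print("large deployment:", large)
```

This ignores the founder's own time spent on ops, which is arguably the bigger cost at the small end.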
This article seems to downplay the great advances that have been made in "cloud" deployment. IMO, a cloud service like heroku beats the pants off of self-operated virtual servers and debatably some of the higher "stages."
Is anyone claiming that? The linked blog post doesn't make that claim.
I also host on Heroku applications that I hope will grow, and if that's a bad choice, I'd like to know others' opinions.
Having other people that can do something for you so you don't have to is a good thing. Having someone else to blame, if you work for yourself, is the downside to outsourcing, not the upside.
Outsourcing is great; but make sure you can always move to another provider if you have trouble with your current provider. Your boss might let you off the hook if a provider screws it up, but your customers won't.
For example, I built something like Mixpanel 2 years ago, but I never launched it because in load testing it really didn't take a very large client's worth of data to exceed the 4GB of RAM I could afford for a server, and hitting disk would make reporting far too slow. Buying a new server for each client (who may decide to cancel the next day) was not something I wanted to commit to. http://i.imgur.com/DAOEA.png
I've ended up at Softlayer after trying a number of hosting companies over the past 10 years. They want $25/mo/GB for RAM. It's almost like they want you to pay the one-time cost of the hardware every month... and then some!
Yet, supporting all this on my own, I can't really colocate -- I don't have the expertise or the money to get it off the ground, nor can I be there to drive to a data center and fix things if something breaks 24 hours a day. I have no employees. I have 60k users to support myself.
It seems like I'll be stuck here for a long time.
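For a sense of scale on the $25/mo/GB RAM complaint, a quick sketch. Only the rental rate comes from the comment; the 16 GB upgrade size and the $50/GB one-time purchase price are hypothetical values for illustration, not quotes from any vendor.

```python
# How fast does rented RAM overtake the one-time cost of buying it?
RAM_RENT_PER_GB_MONTH = 25.0  # rate quoted in the comment


def rented_ram_cost(gb: float, months: int) -> float:
    """Cumulative rental cost for `gb` of RAM over `months` months."""
    return gb * RAM_RENT_PER_GB_MONTH * months


def months_to_match_purchase(purchase_price_per_gb: float) -> float:
    """Months of rent after which you have paid the one-time purchase price."""
    return purchase_price_per_gb / RAM_RENT_PER_GB_MONTH


year_of_16gb = rented_ram_cost(gb=16, months=12)   # HYPOTHETICAL 16 GB upgrade
breakeven_months = months_to_match_purchase(50.0)  # HYPOTHETICAL $50/GB to buy

print(f"16 GB rented for a year: ${year_of_16gb:.2f}")
print(f"rent equals purchase price after {breakeven_months:.1f} months")
```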
You do need a good host though. Quick e-mail replies and pro-active management of server load is absolutely vital. Happy to have found it (in NL) :)
I'm not sure if 40ms vs 160ms latency is so important for many kinds of web sites.
I'm considering a Linode VPS in Dallas.
What are other Brazilians using?
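One rough way to think about the 40ms vs 160ms question: the extra latency is paid once per sequential round trip, so it depends on how chatty a page load is. The sketch below assumes strictly sequential round trips and a hypothetical count of four; real browsers parallelize requests, so treat it as an upper-bound style estimate.

```python
# Network wait as a function of round-trip time and sequential round trips.
def network_wait_ms(rtt_ms: float, round_trips: int) -> float:
    """Time spent waiting on the network if every round trip is sequential."""
    return rtt_ms * round_trips


ROUND_TRIPS = 4  # HYPOTHETICAL: e.g. connection setup plus a few dependent requests

near = network_wait_ms(40, ROUND_TRIPS)
far = network_wait_ms(160, ROUND_TRIPS)

print(f"40 ms RTT:  {near:.0f} ms waiting")
print(f"160 ms RTT: {far:.0f} ms waiting")
print(f"extra wait: {far - near:.0f} ms")
```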
Don't forget that 3-year reserved pricing is 48% cheaper than the on-demand costs, so once you know what your hardware reqs are on EC2, you can purchase some reserved instances and more or less cut your costs in half. Pricing out any hardware configuration on EC2 using the on-demand pricing is tear-inducing.
For Day 1 release, probably not an option. But at the 6-month mark you probably have a much better idea of what hardware your startup needs and can adjust accordingly.
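The reserved-instance point is simple to check with a sketch. The 48% figure is from the comment; the hourly rate and instance count are hypothetical, and the discount is treated as a flat effective rate rather than modelling any upfront fee.

```python
# On-demand vs. reserved monthly cost, using the 48% discount quoted above.
RESERVED_DISCOUNT = 0.48   # from the comment
HOURS_PER_MONTH = 730      # 8760 hours per year / 12
ON_DEMAND_HOURLY = 0.34    # HYPOTHETICAL hourly rate, not a real AWS price


def monthly_cost(hourly: float, instances: int, discount: float = 0.0) -> float:
    """Monthly bill for always-on instances, with an optional flat discount."""
    return hourly * HOURS_PER_MONTH * instances * (1 - discount)


on_demand = monthly_cost(ON_DEMAND_HOURLY, instances=4)
reserved = monthly_cost(ON_DEMAND_HOURLY, instances=4, discount=RESERVED_DISCOUNT)

print(f"on-demand: ${on_demand:.2f}/mo")
print(f"reserved:  ${reserved:.2f}/mo")
print(f"saved:     ${on_demand - reserved:.2f}/mo")
```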
2 CENTS: For the folks that need something better than a Micro and less expensive than a Large, don't forget about the Medium instances.
They aren't in the primary section but down in the "High CPU" section; they are an excellent fit for work that isn't quite big enough for a Large.
Something like Varnish you want a lot of RAM for, dedicated suits that well.
Web servers tend to be numerous and are just computational power (stitch all this gubbins together and return a string); their numbers vary according to demand and they suit virtual servers really well.
Now databases, these really need good disks, lots of RAM, decent CPUs. They are best dedicated or colocated. When things go wrong with a database server you really want to be able to rule out the invisible magic of other hosts, and the voodoo of being a virtual machine.
The best thing I can hope for whilst I scale is to find a provider that will sell me dedicated and public cloud instances that can live on the same VLAN and still be reasonably priced.
I'm currently still totally with Linode, but with 9 instances, and an over-heated database server I know I'm getting close to the limits of what I can do there without re-focusing on splitting up the app when I could be adding new features.
Heroku places no requirements on your code that you wouldn't find through general best practise when building scalable applications. A lot of people will cite the read-only filesystem as a special requirement (which requires S3 or similar), but this is a common requirement with clustered systems. Yes you might have a local SAN that you can use as a local filesystem but the point is the same.
Of the multiple applications I've deployed to Heroku, I don't think there's one that wouldn't run on a 'regular' VPS as is. There's no Heroku-specific code in there, period. In fact, if I have changed my approach to better suit hosting on Heroku, it's generally been changing it to a better approach that would suit all types of hosting.
Has anyone had experience with a hybrid dedicated/cloud model? I'd love to stick our Postgres servers on dedicated hardware but then be able to spin up web servers as cloud servers when needed for traffic spikes.
the homeless
-- (*.tumblr, *.posterous and *.wordpress.com)
the billboard space
-- stackoverflow, facebook, twitter
$50/mo can get you a 4-core raid10 xen VPS which is plenty powerful and almost as flexible as dedicated.
A second hand Pentium II in the corner of the lounge, running Slackware with ports 80 and 21 forwarded to it from a cheap belkin router off your domestic DSL connection.
Good: Basically free (assuming the power bill is in your landlord's name), host whatever the hell you like.
Bad: Someone might spill the bong water over the power strip and ruin your uptime.
Often these links are fan fiction but sometimes they are not.
Who is reading this stuff and why? What kind of behaviour does this inspire? And why do all Pinboard subscribers need to be exposed to this?