[1] Keeping buckets locked down and allowing direct client -> S3 uploads
[2] Using ALIAS records for easier redirection to core AWS resources, instead of CNAMEs.
[3] What's an ALIAS?
[-] Using IAM Roles
[4] Benefits of using a VPC
[-] Use '-' instead of '.' in S3 bucket names that will be accessed via HTTPS.
[-] Automatic security auditing (damn, entire section was eye-opening)
[-] Disable SSH in security groups to force you to get automation right.
[1] http://docs.aws.amazon.com/AmazonS3/latest/dev/PresignedUrlU...
[2] http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/Cre...
[3] http://blog.dnsimple.com/2011/11/introducing-alias-record/
Also, S3 buckets cannot scale infinitely; that's a huge myth. http://aws.typepad.com/aws/2012/03/amazon-s3-performance-tip...
If you don't have an elastic workload and are keeping all of your servers online 24/7, then you should investigate dedicated hardware from another provider. AWS really only makes sense ($$) when you can take advantage of the ability to spin up and spin down your instances as needed.
If we went with all of our own dedicated hardware, or cheaper instances from a different cloud provider, then we'd miss out on ELB and have slower and more expensive communication to and from S3, not to mention that services like Elastic Beanstalk make deploying to EC2 instances very easy compared with rolling your own deployment system. And for those who don't want to bother with administering databases and cache machines, RDS and ElastiCache are going to be cheapest and fastest if your instances are EC2.
So yeah I agree that EC2 is expensive, but the benefits of living fully within the Amazon ecosystem are pretty large.
I can see a lot of benefit to using S3 without EC2, but after that, I'm not sure what else would be possible. Care to elaborate more?
Can you use their queues and database tools w/o using EC2? (If you are using a VPC, maybe?)
If all you need is a server that is up 24/7, rent it by the month. You don't need much information to make an educated choice, since they are pretty much all cheaper than EC2.
I doubt there are many founders who are technically informed enough to know about Amazon Web Services, but don't know about the other big 3 (Digital Ocean, Linode, Rackspace). If you truly don't, then you must not be a tech company, and I have a hard time believing a non-tech company without any technical founders would even know about AWS.
[0]: http://jud.me/post/65621015920/hardened-ssl-ciphers-using-aw...
There are other subtleties which make roles hard to work with. The same policies can have different effects for roles and users (e.g., permission to copy from other buckets).
IAM Roles can be useful, especially for bootstrapping (e.g. retrieving an encrypted key store at start-up), but only use them if you know what you're doing.
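For the bootstrap case, the sketch below shows the kind of narrowly-scoped policy you'd attach to such a role: read access to a single object and nothing else, so a compromised instance can fetch its key store but nothing more (the bucket and key names here are made up for illustration):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-bootstrap-bucket/keystore.jks.encrypted"
    }
  ]
}
```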
Conversely, tips like disabling SSH have negligible security benefit if you're using the default EC2 setup (private key-based login). It's really quite useful to see what's going on in an individual server when you're developing a service.
Also, it does matter whether you put a CDN in front of S3. Even when requesting a file from EC2, CloudFront is typically an order of magnitude faster than S3. Even when using the website endpoint, S3 is not designed for web sites and will serve 500s relatively frequently, and does not scale instantly.
Is blocking 169.254.169.254 important because it could potentially give users access to your instance's metadata service? I'd be interested to hear more about securing EC2 with regard to IAM roles; you seem to have lots of experience in that area.
The disabling SSH tip wasn't really about security (I agree that it has negligible security benefit), it's more about quickly highlighting parts of your infrastructure that aren't automated. It's often tempting to just quickly SSH in and fix this one little thing, and disabling it will force you to automate the fix instead.
The CDN info has been mentioned elsewhere too; lots of things I didn't know. I'll be updating the article soon to add all of the points that have been made. Thanks for the tips!
I make sure all HTTP requests in my (Java) application go through a DNS resolver that throws an exception if: ip.isLoopbackAddress() || ip.isMulticastAddress() || ip.isAnyLocalAddress() || ip.isLinkLocalAddress()
The last clause captures 169.254.169.254. Of course, many libraries use their own HTTP client, so it's easy to make a mistake.
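The same checks are available in Python's stdlib `ipaddress` module; a rough equivalent of the Java clauses above (a sketch, not the commenter's actual code) looks like:

```python
import ipaddress

def is_forbidden_address(host_ip: str) -> bool:
    """Reject resolved addresses that an outbound HTTP client should never
    fetch: loopback, multicast, unspecified (0.0.0.0), and link-local.
    The link-local check is the one that blocks 169.254.169.254
    (the EC2 instance metadata endpoint)."""
    ip = ipaddress.ip_address(host_ip)
    return (ip.is_loopback or ip.is_multicast
            or ip.is_unspecified or ip.is_link_local)
```

As with the Java version, this only helps if every HTTP client in the application actually goes through the resolver that applies it.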
I'm trying to bring my usage of IAM roles down to 0 as a matter of policy. Currently, I'm only using an IAM role to retrieve an encrypted Java key store from S3 (key provided via CloudFormation) and encrypted AWS credentials for other functions (keys contained in the key store). I'd be happier to bootstrap using CloudFormation with credentials that are removed from the instance after start-up.
Thanks for making updates. There are definitely some great tips in there.
What? CloudFront bandwidth costs are, at best, the same as S3 outbound costs, and at worst much more expensive.
S3 outbound costs are 12 cents per GB worldwide. [1]
CloudFront outbound costs are 12-25 cents per GB, depending on the region. [2]
Not only that, but your cost-per-request on CloudFront is way higher than S3's ($0.004 per 10,000 requests on S3 vs $0.0075-$0.0160 per 10,000 requests on CloudFront).
[1] http://aws.amazon.com/s3/pricing/ [2] http://aws.amazon.com/cloudfront/pricing/
For low bandwidth, you're absolutely right, the costs are at best the same. For high bandwidth however (once you get above 10TB), CloudFront works out cheaper (by about $0.010/GB, depending on region). But that wasn't taking into account the request cost, which as you point out, is more expensive on CloudFront, which can negate the savings from above depending on your usage pattern.
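To make the request-cost point concrete, here's a quick back-of-the-envelope using the per-GB and per-request figures quoted in this thread (prices change over time, so treat these numbers as assumptions, not current rates):

```python
def s3_cost(gb, requests, per_gb=0.12, per_10k_req=0.004):
    """Monthly outbound cost using the S3 figures quoted above."""
    return gb * per_gb + (requests / 10_000) * per_10k_req

def cloudfront_cost(gb, requests, per_gb=0.12, per_10k_req=0.0075):
    """Monthly cost using the cheapest CloudFront figures quoted above."""
    return gb * per_gb + (requests / 10_000) * per_10k_req

# When the per-GB rates match, the higher request price is the whole gap:
gb, reqs = 1_000, 5_000_000
print(s3_cost(gb, reqs))          # 120.0 + 2.00 = 122.0
print(cloudfront_cost(gb, reqs))  # 120.0 + 3.75 = 123.75
```

With many small objects the request term dominates, which is exactly the "depending on your usage pattern" caveat above.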
I'll update my post accordingly, thanks for pointing this error out!
Also, S3 buckets cannot scale infinitely; key names have to be managed appropriately for that. http://aws.typepad.com/aws/2012/03/amazon-s3-performance-tip...
Finally :) I like SSH. But I'm the founder of Userify! http://userify.com
There's one that I think could be improved on a little:
Uploads should go direct to S3 (don't store on local filesystem and have another process move to S3 for example).
You could even use a temporary URL [0][1] and have the user upload directly to S3!
[0]: http://stackoverflow.com/questions/10044151/how-to-generate-...
[1]: http://docs.aws.amazon.com/AmazonS3/latest/dev/PresignedUrlU...
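For the curious, here's a rough sketch of what such a temporary URL looks like under S3's old query-string authentication scheme (SigV2). In practice you'd let the SDK generate it for you; the bucket, key, and credential values below are made up:

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import quote

def presigned_put_url(bucket, key, access_key, secret_key, expires_in=3600):
    """Sketch of an S3 query-string-authenticated PUT URL (SigV2 style).
    A client holding this URL can upload the object directly until it
    expires, without ever seeing the secret key."""
    expires = int(time.time()) + expires_in
    # SigV2 string-to-sign for a PUT with no Content-MD5/Content-Type headers
    string_to_sign = "PUT\n\n\n%d\n/%s/%s" % (expires, bucket, key)
    signature = base64.b64encode(
        hmac.new(secret_key.encode(), string_to_sign.encode(),
                 hashlib.sha1).digest()
    ).decode()
    return ("https://%s.s3.amazonaws.com/%s?AWSAccessKeyId=%s"
            "&Expires=%d&Signature=%s"
            % (bucket, key, access_key, expires, quote(signature, safe="")))
```

The upload never touches your servers, which is exactly the point of tip [1] at the top of the thread.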
Getting your application server up and running is the easiest part of operations, whether you do it by hand via SSH or automate and autoscale everything with ansible/chef/puppet/salt/whatever. Persistence is the hard part.
"EBS volumes are not recommended for Cassandra data volumes."
http://www.datastax.com/docs/1.1/cluster_architecture/cluste...
I'll update the article soon to add in the new information.
The server-level monitoring is free, and it's super simple to install. (The code we use to roll it out via ansible: https://gist.github.com/drob/8790246)
You get 24 hours of historical data and a nice web UI. Totally worth the effort.
> Use random strings at the start of your keys.
> This seems like a strange idea, but one of the implementation details
> of S3 is that Amazon use the object key to determine where a file is physically
> placed in S3. So files with the same prefix might end up on the same hard disk
> for example. By randomising your key prefixes, you end up with a better distribution
> of your object files. (Source: S3 Performance Tips & Tricks)
This is great advice, but one small conceptual correction: the prefix doesn't control where the file contents are stored, it just controls where the index entry for that file is stored.
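A common way to get that distribution (a sketch, not anything from the article) is to prepend a short hash of the key itself, so the prefix looks random but stays deterministic:

```python
import hashlib

def distributed_key(key: str, prefix_len: int = 4) -> str:
    """Prepend a short hex digest of the key so lexically-similar keys
    (e.g. date-based names like 2014/02/04/...) land in different parts
    of S3's key index instead of hammering one partition."""
    digest = hashlib.md5(key.encode()).hexdigest()[:prefix_len]
    return "%s/%s" % (digest, key)
```

The trade-off is that listing objects in natural key order gets harder, since the ordering is now by hash prefix.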
If you add @import url(http://fonts.googleapis.com/css?family=Droid+Sans:400,700); to your stylesheet, you should notice an improvement in the boldface font rendering. Great article, btw.
Yes! Centralized logging is an absolute must: don't depend on being able to log in and look at logs on each box. That grows wearisome fast.
It's just a way to stop yourself from cheating and SSHing in just to fix that one thing, instead of automating it.
I don't want to learn complex stuff like Chef/Puppet, btw... anything SIMPLE?
For logging, try logstash? http://logstash.net/
Monitoring... well that's a large and complicated topic!
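If you do try logstash, a minimal file-to-Elasticsearch pipeline looks roughly like this (paths and hosts are placeholders, and option names vary between logstash versions, so check the docs for yours):

```
input {
  file {
    path => "/var/log/myapp/*.log"   # placeholder path
  }
}
output {
  elasticsearch {
    host => "logs.internal.example"  # placeholder host
  }
}
```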
Does anybody else here agree with this mentality? This seems like major malpractice to me. I've worked at companies with as few as two people and as many as 50,000 people. None of them have had production systems that are entirely self-maintaining. Most startups are better off being pragmatic than investing man-years of time handling rare error cases like what to do if you get an S3 upload error while nearly out of disk space. There's a good reason why even highly automated companies like Facebook have dozens of sysadmins working around the clock.
I thought all of his other points were spot-on but this one rings very dissonant to my experience.
When developing an application, for example, it's often necessary to SSH in to play with some things. But once you're ready to go to production, you want as much automation as possible. Forcing yourself not to use SSH will quickly show you where you aren't automated.
The idea is that if a user can't SSH in (at least not without modifying the firewall rules to allow it again), it will force them to try and automate what they were going to do instead. It worked well for me, but it's probably not for everyone.
And make it Wiki-ized.
Just go with a PaaS, like Heroku or AppEngine, and forget about this sysadmin crap.