Um. Yeah. You're also creating a situation where, should anything possibly go wrong (now, how could that happen), you cannot hop on the server, diagnose, and possibly do some quick-try fixes. I can see limiting shell to a small trusted subset of colleagues (sorry, devs), but not eliminating it altogether.
Automated management, "system-as-code", devops, and all that has its place. The problem with virtually all existing systems for automated configuration management (cfengine, puppet, chef, Oopsware) is that they are highly non-transitive, and effectively bin years or decades of administration experience. They're fine once you've worked out the kinks, but as anyone who's worked with these tools can tell you, the one thing they're best at is screwing up all of your systems simultaneously.
"Etch" (http://sourceforge.net/projects/etch/) is one tool I've looked at briefly that appears to take a different tack, and is amenable to taking on-host changes and incorporating them into the configuration management system itself. One of the criticisms I've seen of it is that it's rather Linux-centric and/or specifically Debian-centric, which may well be the case, since Debian offers some very strong tools for managing, assessing, and maintaining system state (policy, APT, debconf). While dependencies can be painful when they keep you from doing what you want to do, as with most good safety systems, it's generally because you really don't want to go there (and if you do, there are means, within the framework provided by Debian, to get you there).
One way to avoid snowflake systems is to use tools that manage dependencies, stick within them to the greatest extent possible, and where that's not an option, to put your own modifications within that same framework.
I just start writing code to provision/configure the machine for the role in question and don't stop until I have something that can reliably take a blank-slate server and have it rolling by the end of the function.
Easiest way to do devops I've seen yet. I didn't care for chef/puppet/cfengine.
Particularly since I can cherry-pick servers for testing pretty easily with fabric before I run the code against the rest of the machines.
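For anyone who wants the "test on a few servers first" idea spelled out, here is a minimal sketch in plain Python. It is not Fabric's API: `rollout`, `provision`, and `canary_count` are hypothetical names, and `provision` stands in for whatever function configures one host.

```python
from typing import Callable, Iterable, List


def rollout(hosts: Iterable[str],
            provision: Callable[[str], None],
            canary_count: int = 1) -> List[str]:
    """Provision a few canary hosts first, then the rest.

    If provisioning a canary raises, the exception propagates and the
    remaining hosts are never touched.
    """
    hosts = list(hosts)
    canaries, rest = hosts[:canary_count], hosts[canary_count:]
    done = []
    for host in canaries:
        provision(host)  # a failure here aborts the whole rollout
        done.append(host)
    for host in rest:
        provision(host)
        done.append(host)
    return done
```

The point of the split is only ordering: a broken change fails on one cherry-picked box instead of on all of them at once.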
I also use boto in my fabfile, so that I can run a single command to launch an EC2 instance, install & configure software on it, save an AMI and terminate the instance. It's like having a Makefile that compiles machine images:
> fab build:<role> <region>
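The shape of such a build task, roughly, as a sketch only: the `Cloud` interface below is hypothetical and is not boto's actual API, and `provision` stands in for the role-specific setup code. Terminating the instance in a `finally` block is an assumption of this sketch, so that a failed build does not leave a stray instance running.

```python
from typing import Callable, Protocol


class Cloud(Protocol):
    """Hypothetical cloud client; method names are illustrative only."""
    def launch(self, base_image: str, region: str) -> str: ...
    def snapshot(self, instance_id: str, name: str) -> str: ...
    def terminate(self, instance_id: str) -> None: ...


def build(cloud: Cloud, role: str, region: str, base_image: str,
          provision: Callable[[str, str], None]) -> str:
    """Launch a scratch instance, provision it for `role`, save an image,
    and terminate the instance whether or not provisioning succeeded."""
    instance_id = cloud.launch(base_image, region)
    try:
        provision(instance_id, role)
        # Returns the identifier of the saved image.
        return cloud.snapshot(instance_id, name=f"{role}-image")
    finally:
        cloud.terminate(instance_id)
```

Like a Makefile target, the only artifact that survives is the image; the instance that produced it is disposable.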
I wouldn't go as far as Fowler's "disable the shell" idea, but it doesn't take much discipline to do all significant changes by building new AMIs.
However, it took me days to complete the script and make it moderately reusable (and completely idempotent and able to handle different configurations), and it still doesn't do as much as I want it to. For example, I just noticed that Ansible[1] can tell whether a change needs to restart a service (e.g. if a config file changed) and only then do it.
That's pretty useful, but my script can't really handle it.
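The "restart only if the config actually changed" behavior is not hard to approximate in a homegrown script. A minimal sketch, assuming the config is a single text file; `write_if_changed`, `ensure_config`, and the `restart` callback are hypothetical names, and this is not how Ansible implements it.

```python
from typing import Callable


def write_if_changed(path: str, content: str) -> bool:
    """Write `content` to `path` only if it differs from what is there.

    Returns True if the file was written, False if it was already current.
    """
    try:
        with open(path, "r", encoding="utf-8") as f:
            if f.read() == content:
                return False
    except FileNotFoundError:
        pass  # no existing file, so this counts as a change
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
    return True


def ensure_config(path: str, content: str,
                  restart: Callable[[], None]) -> bool:
    """Bring the config file up to date; call `restart` only on change."""
    changed = write_if_changed(path, content)
    if changed:
        restart()
    return changed
```

Running it twice with the same content writes once and restarts once, which is also what makes the step idempotent.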
My startup has been in production on EC2 for 6 months, and I think we've never had a server up for more than 3 days, and never booted a production box from an AMI more than 2 weeks old.
Once you have more than a few servers you will go crazy if you don't have a good configuration management setup. The real advantage of using a tool like puppet over something home grown is that you can hire someone to come in and manage puppet who will be able to understand how your automation works without having to have your system admin sit down and explain the 1000 little arcane perl scripts that make everything work.
We use client-server puppet, but require all the updates to be manually run, which makes rollouts a little bit more deterministic and avoids the "Hey we just changed puppet and everything broke" effect.
In an odd way, it feels like this is the main advantage that PaaS offerings like Heroku have: they force you into the mindset of not relying on hand tinkering with the server config.