Docker will likely be more prevalent in a few years with startups who have built their infrastructure from the ground up.
That's great, if that's what you need. But most people aren't building a service like that. HN, I believe, runs on one machine, with a second for failover purposes. And HN still has many, many more users than typical company-internal services, community services, or at the extreme end personal services.
When you aren't operating at absurd scale, "Google-style" infrastructure doesn't do you any favors. But the industry sure wants to convince us that scalability is the most important property of infrastructure, because then they can sell us complicated tech we don't need and support contracts to help us use it.
(Disclosure: I'm the lead developer of https://sandstorm.io, which is explicitly designed for small-scale.)
The problem with Docker is not that it doesn't solve (or attempt to solve) widespread problems. At its best, Docker gives you dev/production parity, and dependency isolation which is useful even for solo developers working part-time. The problem is that it's not a well-defined problem that can be solved by thinking really hard and coming up with an elegant model—like, for example, version control—it's messy and the effort to make it work isn't worth it most of the time right now.
That's no reason to write off Docker though. Pushing files to manually configured servers or VPSes is messy and leads to all kinds of long-term pain. You can add Chef / Puppet, but it turns into its own hairy mess. There's no easy solution, but from where I stand, the abstraction that Docker/LXC provide is one that has the most unfulfilled promise in front of it.
The opposite seems likely ... Docker will fade and become deprecated as building infrastructure from the ground up locally to feed into the cloud becomes cheaper and cheaper still. AWS is not always so cost-effective when you truly dig in and crunch the numbers.
My guess as to why Docker won't succeed widely in production is because it's a software-based solution trying to glue together slippery pieces that just don't want to be glued together. The core issue of security will never be solved by a Docker-like solution; that problem is best solved by integrated hardware.
This very issue is being addressed in ClearLinux: http://sched.co/3YD5
With regards to docker/lxc/container security, you're right. Some of the biggest players haven't solved the lxc/docker/container security issues yet; it's a really hard problem to solve. Breaking out of a container will always be easier than breaking out of deeper levels of virtualization (Xen/KVM).
If you have a consistent level of traffic (i.e. you don't have inordinately wild upswings/downswings like e.g. Reddit), AWS isn't even remotely cost-effective. I was going to do the math to compare our current physical server infrastructure with AWS, and even if you factor in that physical servers need to be in pairs (for redundancy) and over-provisioned (for traffic spikes), I didn't even get as far as back-of-the-envelope math before it was obvious that AWS was completely infeasible.
(1) Testing.
(2) Build environments -- it's helpful to build distribution Linux binaries in older Linux versions like CentOS 6 so that they'll work on a wider range of production systems.
(3) Installing and running "big ball of mud" applications that want to drag in forty libraries, three different databases, memcached, and require a custom Apache configuration (and only Apache, thank you very much).
#3 is really the killer app.
This has led me to conclude that Docker is a stopgap anesthetic solution to a deeper source of pain: the Rube Goldberg Machine development anti-pattern.
More specifically, Docker is a far better solution than the abomination known as the "omnibus package," namely the gigantic RPM or DEB file that barfs thousands of libraries and other crap all over your system (that may conflict with what you have).
Well written software that minimizes dependencies and sprawl and abides by good development and deployment practices doesn't need Docker the way big lumps of finely woven angel hair spaghetti do.
Docker might still be nice for perfect reproducibility, ability to manage deployments like git repos, and other neat features, but it's less of a requirement. It becomes maybe a nice-to-have, not a must-have.
But... if my software is not a sprawling mess that demands that I mangle and pollute the entire system to install it, why not just coordinate development and deployment with 'git'? Release: git tag. Deploy: git pull X, git checkout tag, restart.
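A minimal sketch of that git-only flow, written in Python for illustration. The remote name `origin`, the tag `v1.2.0`, and the restart command `systemctl restart myapp` are hypothetical placeholders. It also uses `git fetch --tags` in place of `git pull`, since checking out a tag doesn't need a merge. The functions only build the command lists so they can be reviewed before anything is run.

```python
import shlex


def release_commands(tag, remote="origin"):
    """Commands run on the developer machine to cut a release."""
    return [
        ["git", "tag", "-a", tag, "-m", f"Release {tag}"],
        ["git", "push", remote, tag],
    ]


def deploy_commands(tag, restart_cmd, remote="origin"):
    """Commands run on the server to deploy a tagged release."""
    return [
        ["git", "fetch", "--tags", remote],
        ["git", "checkout", tag],
        # restart_cmd is whatever restarts your service (hypothetical here).
        shlex.split(restart_cmd),
    ]


if __name__ == "__main__":
    steps = release_commands("v1.2.0") + deploy_commands(
        "v1.2.0", "systemctl restart myapp"
    )
    for cmd in steps:
        print(shlex.join(cmd))
```

To actually execute a step, each list could be passed to `subprocess.run(cmd, check=True)`.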
Finally, Docker has a bit of systemd disease. It tries to do too much in one package/binary. This made the rounds around HN a while back:
https://github.com/p8952/bocker
It demonstrates that at least some of Docker's core functionality does not require a monster application but can be achieved by using modern filesystems and Linux features more directly.
So honestly I am a bit "meh" about Docker right now. But hey it's the hype. Reading devops stuff these days makes me wonder if "Docker docker docker docker docker docker docker" is a grammatically correct sentence like "Buffalo buffalo buffalo buffalo buffalo buffalo."
Docker actually doesn't help reproducibility at all, because the underlying reproducibility problems present in the distro and build systems being used are still present. See GNU Guix, Nix, and Debian's Reproducible Builds project for efforts to make builds truly reproducible.
I had a good laugh when I read "the Rube Goldberg Machine development anti-pattern". This describes the situation of "modern" web development perfectly. I'll add that such software typically requires 3 or more different package managers in order to get all of the necessary software. And yes, Omnibus is an abomination and Docker is much better.
I think Docker is papering over issues with another abstraction layer. It's like static linking an entire operating system for each application. Rather than solving the problem with traditional package management, Docker masks the problem by allowing you to make a disk image per application. That's great and all, but now you have an application that can only reasonably be run from within a Linux container managed by Docker. Solving this problem at the systems level, which tools like GNU Guix do, allows even complex, big ball of mud software to run in any environment, whether that is unvirtualized "bare metal", a virtual machine, or a container.
```
The following packages are needed to run bocker.

- btrfs-progs
- curl
- iproute2
- iptables
- libcgroup-tools
- util-linux >= 2.25.2
- coreutils >= 7.5

Because most distributions do not ship a new enough version of util-linux you will probably need grab the sources from here and compile it yourself.

Additionally your system will need to be configured with the following.

- A btrfs filesystem mounted under /var/bocker
- A network bridge called bridge0 and an IP of 10.0.0.1/24
- IP forwarding enabled in /proc/sys/net/ipv4/ip_forward
- A firewall routing traffic from bridge0 to a physical interface.
- A base-image which contains the filesystem to seed your container with.
```
Is this the "well-written software" pattern that you're talking about? Because to me, this looks like a "big ball of mud" - i.e. dependence on an eclectic combination of libraries, co-programs, and environment configuration - and indeed, if for some perverse reason I felt like I wanted to deploy this in production, it's exactly the kind of thing I'd wind up writing a Dockerfile for. (Which, I notice, is functionality "Bocker" doesn't attempt to replicate.)
I hate being passive-aggressive so I'll be directly aggressive here: this mentality is a way to say, "I don't want to revisit the operational aspects of my system because I don't like to do that work. Find someone else."
Like any aspect of your system, your ops and deploy components can rot. Pretending otherwise is outright ignoring a consistent lesson offered by those who came before and have failed over and over.
Docker offers to take over as a project many aspects of the system subject to bit-rot and make an explicit and consistent container abstraction for software to compose. While it has many features we do need (I agree wholeheartedly that it'd be great to parallelize layer creation, less so about secret exposure since the environment & volume tooling already can do that), it has also replaced whole categories of software and devops tooling with simple and extensible metaphors.
And then there's the part where Weave is slow, so you might as well stick to VMs or hardware...
http://www.generictestdomain.net/docker/weave/networking/stu...
He tries to demo me what they have currently and the damn thing timed out during login. I laughed.
The cost of these headaches is easily avoidable. Get off the ground and running first, pay the kind-of-premium Heroku bill, and when you're ready to really scale, make the switch.
There are few exceptions to managing an infrastructure, such as RackSpace, a cluster of AWS nodes, your own metal, etc. versus something like Heroku.
- 1 webserver/proxy, let's say nginx
- 1 simple Rest API server, let's say in flask
- 1 database, let's say PostgreSQL
and I want to connect all 3 things and I want to preserve logs for the whole time and preserve the state of the database (of course). Also not to forget make all bulletproof for the Internet.
And here all sorts of problems arise: what underlying OS to use, how to connect these containers, how to preserve the state of my database and logs (it's not trivial, as the article proves again). So overall Docker does not make life easier for this simple use case; it makes life (of the sysadmin) more complicated.
For example:
- What underlying OS? CF provides a minimal Ubuntu Linux "stemcell" and then has a standard "rootfs" for Linux containers
- a Python buildpack to assemble the container on top of this OS for your Flask server
- a built-in proxy/LB so you don't need one, if you want a static web server there's a static buildpack for Nginx
- an on demand MariaDB Galera cluster for your database if you want HA; PostgreSQL is there too but non-HA I think
- A standard environment variable based service marketplace & discovery system for connecting the containers to each other or to the database
- high availability (with load balancer awareness) for your containers at the container, VM or rack level
- reliable log aggregation of your containers (which you can divert to a syslog server).
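As a rough illustration of the environment-variable-based discovery mentioned in the list above: Cloud Foundry exposes bound services to an app as JSON in the `VCAP_SERVICES` environment variable. The Python sketch below parses that JSON and looks up credentials by instance name. The service label `p-mysql`, the instance name `orders-db`, and the credential fields are made-up examples; real values depend on the marketplace in your installation.

```python
import json
import os


def find_credentials(instance_name, environ=os.environ):
    """Return the credentials dict for a bound service instance, or None."""
    services = json.loads(environ.get("VCAP_SERVICES", "{}"))
    # VCAP_SERVICES maps each service label to a list of bound instances.
    for instances in services.values():
        for instance in instances:
            if instance.get("name") == instance_name:
                return instance.get("credentials")
    return None


if __name__ == "__main__":
    # Hypothetical payload, for demonstration only.
    sample = {
        "VCAP_SERVICES": json.dumps({
            "p-mysql": [{
                "name": "orders-db",
                "credentials": {"hostname": "10.0.0.5", "port": 3306},
            }]
        })
    }
    print(find_credentials("orders-db", sample))
```

Passing the environment in as a parameter keeps the lookup easy to try locally without a CF deployment.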
As I said, the only trouble when you want to make this "bulletproof" is that there are a dozen "support VMs" that are all there to make your app bulletproof and secure, e.g. an OAuth2 server, the load balancer, an etcd cluster, a Consul cluster, the log aggregator, etc. So it's overkill for one app, but good if you have several apps.
For single tenants and experimental apps, there's http://lattice.cf which runs on 3 or 4 VMs and is a subset of the above, but not what I'd call "production ready".
- 1 simple Rest API server, let's say in flask
Dokku - https://github.com/progrium/dokku
Can't really beat `git push deploy/uat`
- 1 database, let's say PostgreSQL
I just run PostgreSQL on the host and connect to it from the containers. Sure I could containerise PostgreSQL itself but I don't really see the point.
I then run my own Dokku plugin (dokku-graduate: https://github.com/glassechidna/dokku-graduate) for graduating my apps from UAT to production.
Databases are also tricky to run in containers, because even those with the best replication strategies can afford to lose nodes only at a high cost (like re-balancing nodes, etc.), and containers still don't have the stability to provide an acceptable uptime that's worth the risk.
On a side note, since you mentioned nginx and RESTful APIs, I would check out Kong (https://github.com/Mashape/kong) which is built on top of nginx, and provides plugins to alleviate some of these problems (http://getkong.org/plugins/).
edit: I guess you can cram all of the various Kubernetes master/etcd servers on a single node but whoops there goes reliability.
https://docs.docker.com/compose/ https://docs.docker.com/reference/run/#logging-drivers-log-d...
Once you understand how docker works, using the YAML file can become useful to lighten your load.
Multi-Host is moderately more difficult. A full orchestration and resource scheduling stack that scales with load even more so.
But you have to ask what your needs are if you're being realistic.
> Where to put logs
Well, I just throw them aside and use `docker logs [container]`
> How to manage state
One container should perform one service. I haven't run into a problem here.
> How to schedule containers
ECS :) But honestly, I subscribe to the approach that containers = services and thus should just always be running.
> How to inspect app
`docker exec -it [ container id ] bash` ("ssh" into container)
`docker logs`
`docker logs -f [container]` (follow logs)
> How to measure performance
Probably same way you measure system performance
> How to manage security
Everything of mine is in a VPN; some services can talk to certain services over certain ports... Personally, I don't really understand all this talk about security. Protect your systems and that should protect your containers. Why is it that isolated processes are causing people to throw up their arms like security is unimaginable in such a world? There are ways...
> Consistency across docker containers
This can be a pain if you need this, yea. They seem to be adding better & better support to allow containers to talk to one another (and ONLY to one another).
> Ain't nobody got time for that.
Hmm, personally I don't have time to go thru what Puppet, Chef, and even Ansible require to get your systems coordinated. I see this as far more work than creating a system specification within a file and finding a way to run it on some system.
All comes down to requirements though and where your technical stack currently is at. To any newcomers who are also plowing into the uncertain fields of a dockerized stack, fear not! You are in good company and if I can make it work, you can too.
At the end of the day, you have to view it as building a reliable system that performs a function. Docker is one tool you can use to do that. Virtual machines are another tool. They don't solve all the problems you describe, nor are they intended to. If you're a tiny startup, you can just go the AWS route, but that leaves you beholden to AWS and their pricing. That's fine early on, but eventually you'll want to go full-stack for one reason or another.
[1]: http://mcfunley.com/choose-boring-technology-slides [2]: http://mcfunley.com/choose-boring-technology
That's not to say you're wrong; containers probably aren't that useful to most small shops. But that summary doesn't make any sense for this article.
Do me a favor and if you got a startup, stay clear of all this. Everyone wants to reinvent their own flavor of heroku and make your deployment and build pipeline god-awful complex. Their tool of choice? Docker.
Before you know it you'll be swimming in containers upon containers. Containers will save us, they'll cry! Meanwhile you have 0 rows of data before you've paid them their first month's salary and have spent time on solving problems of scale you'll never have.
Focus on your product, outsource the rest. And leave customized docker setups to mid-stage startups and big corps who already have these problems, or at least the money and people to toil on them. Not everything needs to be a container! And most companies are not and will never be Google!!
I quit the job.
The scenario played out just as you said: I ended up single-handedly and poorly re-engineering something that already existed (they did have a working Ansible setup) for no visible gain. "Swimming in containers upon containers" is exactly what happened; they kinda worked, but the farther we got, the more kludges piled on top of each other. In four months' work we didn't even hit production - the most we got was a CI/QA service that was actually nothing more than a loose bunch of Python scripts. Between managing dev/test/prod differences, tracing missing logs, removing unused volumes, networking all that stuff together and trying to provide at least a decent level of security, I realized that I was wasting everyone's time and money. Developers hated it because it filled their workflows with traps and obstacles. Admins hated it because of the lack of tooling. Business hated it because it caused unexplainable delays. The only thing we really accomplished was some compliance with The Twelve-Factor App - something that could've been done in a week. Hardly a victory.
My advice? Forget about Docker unless your primary business is building hosting systems. It will take years before Docker gets mature enough for production, and not without a ton of tooling on top of it and some major architectural changes. Until then, go back to the old UNIX ways of doing things... it worked perfectly since the Epoch and it will continue to work long after the 32-bit time_t rolls over. You'll be fine.
The services in question are built with a hodge-podge of shell scripts and build tools, so getting them all to compile locally is a challenge, let alone deploying them. My hope was that containerizing the builds would isolate any configuration problems, and that containerizing the deployed services would cut down on outages by permitting trivial rollbacks (say, by snapshotting all the service containers before each deploy and merely restoring them should a deployment fail). Of course, all of the above could be fixed by traditional means (e.g. rewriting the build system with a single, standard tool; streamlining the deployment process, etc.), but it seemed like Docker could solve 80% of the problems while easing the implementation of the proper solutions down the line.
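The snapshot-then-restore idea above can be sketched independently of any particular tool. In this Python sketch, `snapshot`, `deploy`, `healthy`, and `restore` are hypothetical callables you would supply yourself (for example, thin wrappers around whatever container tooling you use); nothing here is Docker's own API.

```python
def deploy_with_rollback(services, snapshot, deploy, healthy, restore):
    """Deploy each service; if any step fails, restore every snapshot taken.

    Returns True on success, False if a rollback happened.
    """
    # Snapshot everything first so a failed deploy can be undone as a whole.
    snapshots = {name: snapshot(name) for name in services}
    try:
        for name in services:
            deploy(name)
            if not healthy(name):
                raise RuntimeError(f"{name} failed its health check")
    except Exception:
        # Put every service back, including ones that deployed cleanly,
        # so the system returns to one consistent known-good state.
        for name, snap in snapshots.items():
            restore(name, snap)
        return False
    return True
```

Rolling back all services rather than only the failed one is a design choice: it avoids leaving a mix of old and new versions that was never tested together.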
Considering the above, do you still think Docker's a poor fit for businesses that aren't building hosting systems? Oh, and any nuggets of wisdom you could throw to a newcomer to the industry? :)
> Focus on your product, outsource the rest.
What do you mean by outsource the rest?
Do you mean, "hey we're using AWS <Everything>-as-a-Service because we don't want to manage a DB cluster or deal with a load balancer"?
Or do you mean, rely on existing available tools and stop reinventing the wheel every week?
iamleppert means: Identify your company's core competency and do that in house, but outsource or avoid that which is not your core.
For example, we're making a game. Gameplay, art, and tech is all done in-house and not with remote contractors because it needs to be -- it's the part of the product we love and the part our players will end up loving. Email, forums, chat, HR, applicant tracking systems, and git hosting are outside of our core and best handled by others.
Regarding the outsourcing, that's what we're shooting for at Giant Swarm. We've written a stack that runs containers and manages the metal underneath for you. We run the solution as a shared public cluster at giantswarm.io, but can also do private hosted deployments or managed on-prem deployments. It's a complete solution for running containers that feels like a PaaS, but without all the opinionated crap associated with a PaaS.
We're basically offering to be your little devops team that could - with containers.
Services like Cloud66 are interesting (they manage deployments onto your own EC2 or other cloud infrastructure), but the developer experience doesn't quite match Heroku yet.
Heroku really needs some more competition...
Leading up to Kubernetes v1, getting set up was quite painful but the api is now stable. So now I have an environment running Kafka, Cassandra, Spark and various java based services and am quite happy with it so far.
I've started using the hawtio kubernetes console to visualise my containers but may write my own tool to do this. I might also write a blog post detailing how to install, configure pods, expose a service and use the logs to diagnose problems. It's usually the first thing you'd want to know when starting out outside of GCE infrastructure.
It is well supported on AWS as well, and a variety of bare-metal solutions (e.g. Red Hat Atomic, CoreOS)
However, concretely, it is a challenge to maintain good support for N different platforms without an owner who is willing to stand up and ensure that it works, and continues to work for that platform.
We have gotten a number of drive-by contributions of "how to" guides that (sadly) bit-rot over time. As always, we're working on improving the situation, but it is complicated and requires a great deal of time and access to infrastructure (e.g. Rackspace) that the core team simply doesn't have.
I eventually wanted to set it up in Rackspace where my company cloud lives. I used corekube (https://github.com/metral/corekube) heat templates to set it up but the way to add more minions to the cluster wasn't simple. On top of that there's this complicated networking I have to set up on Rackspace, because k8s runs in its own flannel subnet which is actually overlaid on top of an isolated Rackspace private subnet. To access the API externally I had to do some NAT manipulation to interface that private subnet to the Rackspace public IP just to get the guestbook example working. Dunno how I'd have managed that with a more complex setup of multiple services.
But then, if you don't feel like you need it, that's probably because you don't need it.
(If people are downvoting your question, it's probably because you're giving off a bit of a "I don't understand Docker so it must be crap" vibe, which is not helpful.)
Sorry if my initial question came across with a weird vibe. I'm genuinely curious. I have colleagues working at places where they actually are being asked to drop everything and implement Docker. I asked why and what's driving this and got the predictable response of "management/dev/someone wants something new".
Also, shout out to the fanboys for downvoting my question, which was just a question asking for thoughts and answers and didn't make any statement whatsoever.
I mean if your app needs the entire fucking OS to provide isolation from other apps, then you are clearly doing it wrong.
Docker could be much more successful in the Windows world: the ability to package very precise versions of databases, libraries, and weird obsolete applications into one image that can be deployed easily would be extremely helpful in many companies. It would be the wrong solution, but an easy work-around for broken upgrade paths.
Having containers able to package weird obsolete (unpatched) applications, specific (out-of-date) versions of libraries, and poorly-written homespun code is a recipe for exploits. The out-of-date version of the library (e.g. Java 7) likely has exploits out in the wild that have been patched in more recent versions. The weird obsolete application (e.g. DTS) likely not only has exploits patched in the active codepath, but has multiple bugs and integration issues. The homespun code likely reimplements something done better in another application or library, and introduces more bugs and vulnerabilities to the network.
Sorry for going off on this, but being able to repackage unsupportable applications would be a nightmare in places I've worked before.
Unfortunately full App-V is only for Windows enterprise customers.
Prove it. I'm not saying it's impossible, but it's certainly not trivial.
Also, take a look at what Joyent are doing with Triton.
It is also worth mentioning that since Joyent has implemented their own docker client, not all features are there yet. Last time I tried docker-compose didn't really work right yet. There is a full list of divergences on their github page. It has a lot of potential though.
see : bocker
If I have less than 50 (maybe even 100) EC2 instances for my applications there is no way in hell I am going to run 3 service discovery instances, a few container scheduler instances and so on and so forth.
For whatever it's worth, we completely agree with the sentiment (and I like your "blue collar apps" term) -- and we deliberately have designed Triton[1] for ease of use by virtualizing the notion of a Docker host. I think that the direction you are pointing to (namely, ease of management for very small deployments) is one that the industry needs to pay close attention to; the history of technology is littered with the corpses of overcomplicated systems that failed because they could not scale down to simpler use cases!
[1] https://www.joyent.com/blog/triton-docker-and-the-best-of-al...
Fortune 500 technology customers. Fortune 500 companies who have hundreds of millions and decades of work invested in their infrastructure generally aren't going to jump on whatever the latest infrastructure trend is.
You could go "old school" and have some (virtual) servers do more than one thing :)
But really, the most painful aspect of using Docker in production, at least in environments where you need multiple physical servers (or VMs) is overall orchestration of the containers, and networking between them.
Things are much better today than they were a year (or 6 months!) ago... but these are two parts of Docker configuration that take the longest to get right.
For orchestration: there are currently at least a dozen different ways to manage containers on multiple servers, and a few seem to be gaining more steam, but it feels much like the JS frameworks era, where there's a new orchestration tool every week: flynn, deis, coreos, mesos, serf, fleet, kubernetes, atomic, machine/swarm/compose, openstack, etc. How does one keep up with all these? Not to mention all the other tooling in software like Ansible, Chef, etc.
For networking: if you're running all your containers on one VM (as most developers do), it's not a big deal. But if you need containers on multiple servers, you not only have to deal with the servers' configuration, provisioning, and networking, but also the containers inside, and getting them to play nicely through the servers' networks. It's akin to running multiple VMs on one physical machine, but without using tools like VMWare or VirtualBox to manage the networking aspects.
Networking is challenging, but at least we have a lot of experience with VMs, which are conceptually similar. Orchestration may take more time to nail down and standardize.
I keep hearing about people putting Docker in dev and test environments and not production. This use case makes no sense to me as you would throw away the entire point of containers and have a wildly inconsistent path to production.
Relying on Puppet (as with prod) means development VM setup/change time is measured in hours. My company's Puppet catalog takes 15 minutes to compile, 6 hours to run. Entire days of developer productivity are lost trying to get development VMs working. Docker would make that instantaneous. It's also very hard to manage and synchronize data (i.e. test fixtures) across all those services. With Docker you could have a consistent set of data in the actual images and revert to it at will.
Even a simple `npm install` in a docker container fails on Windows because of the lack of support for symlinks (adding --no-bin-links means npm's run scripts can't be used to their full and useful extent).
Throw in a database, a cache server, couple of versioned libraries your jar file needs, and more developers, and suddenly a reproducible image with all this packaged will make a lot of sense.
I have a database and a cache server. They don't run on the same server as the application jar... they run on separate machines tuned to their purpose. Why would I want them packaged together? So my team doesn't have to run "apt-get install postgresql" on their dev machines? Or to maintain an exactly consistent dev environment?
Officially Docker is only supported on RHEL 7 and up, and most systems I've seen are still on RHEL6.
I think it's just a matter of time before Docker goes into production. Where I'm working we're seriously looking at "Dockerizing" lots of things, but OS support keeps popping up.
I really wish RH had found the time to fix RHEL 6 and support docker.
RHEL 7/CentOS 7 is a big step for many. RHEL 6 isn't even near EOL and many people (including myself) wanted to get more mileage out of CentOS 6.
I'm frustrated though because I keep pinging them about adding branch information to their (dockerhub) webhooks so I can actually deploy environments via branches.. It's crazy vital in my opinion and seems like it should be an easy fix, but 2 months later and still doesn't seem to be scheduled in.
Nevertheless, I'm sure Docker has its technical shortcomings but really, I wouldn't say it's not succeeding.. it's just young. Adoption takes time.
That said, what we do is we have our CI system build our docker images, push them to dockerhub (private registry) if the tests all go great, and then we deploy using https://github.com/remind101/deploy. We also tag all our images with the git SHA that they were created from, so we have immutable identifiers for each image, which has been useful.
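A small Python sketch of the SHA-tagging step, in case it helps. The repository name `example/app` is a placeholder and the surrounding CI system is assumed; only `git rev-parse HEAD` and the `repository:tag` image reference format are standard.

```python
import re
import subprocess


def current_git_sha(repo_dir="."):
    """Return the full SHA of HEAD in the given checkout."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def image_ref(repository, sha):
    """Build an immutable image reference such as 'example/app:<sha>'."""
    # Reject mutable tags like 'latest'; only hex SHAs are accepted.
    if not re.fullmatch(r"[0-9a-f]{7,40}", sha):
        raise ValueError(f"not a git SHA: {sha!r}")
    return f"{repository}:{sha}"
```

In a CI job, `image_ref("example/app", current_git_sha())` would give the name to pass to `docker build -t` and `docker push`.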
We just recently put direct github deployment support in Empire, so that's been really nice (before we had to use another service that pulled deployments and put them into Empire).
Anyway, not quite the workflow you're talking about, but it's really worked well for us, so maybe it'd help you as well :)
But dealing with all of the problems of deploying Docker to production doesn't look worth the time investment for a medium-sized company IMO (we're at ~1700 VMs).
> Every major deployment of Docker ends up writing a garbage collector to remove old images from hosts. Various heuristics are used, such as removing images older than x days, and enforcing at most y images present on the host. ...
More specifically, I'm confused by this sentence, from the above paragraph in TFA:
> Most people discover their need by accident when their production boxes scream for space.
When did Docker become a replacement for Ops or Devops which are aware of their servers, monitoring systems that let you know when you're getting close to the "Yellow Alert" warning, and some sort of plan for growth and expansion?
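For what it's worth, the two heuristics quoted from the article (drop images older than x days, keep at most y per host) are simple to express. The Python sketch below only decides what to remove; the `(image_id, created_at)` tuples and the default thresholds are assumptions for illustration, and actually listing or deleting images is left to whatever tooling you use.

```python
from datetime import datetime, timedelta, timezone


def images_to_remove(images, now, max_age_days=14, max_images=10):
    """Pick image IDs to delete.

    images: iterable of (image_id, created_at) pairs with comparable datetimes.
    An image is removed if it is older than max_age_days, or if it falls
    outside the max_images newest images.
    """
    cutoff = now - timedelta(days=max_age_days)
    newest_first = sorted(images, key=lambda item: item[1], reverse=True)
    remove = []
    for index, (image_id, created_at) in enumerate(newest_first):
        if created_at < cutoff or index >= max_images:
            remove.append(image_id)
    return remove


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Hypothetical image IDs, for demonstration only.
    sample = [
        ("img-new", now - timedelta(days=1)),
        ("img-old", now - timedelta(days=40)),
    ]
    print(images_to_remove(sample, now))
```

A real collector would also need to skip images that running containers still depend on, which this sketch does not attempt.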
Hardware isn't free, but it seems like some want Docker to make hardware free; delivering on the promise that full-OS VMs couldn't realize, which was trying to deliver on the promise that HT couldn't make happen. I'm sorry, but if you have 24 cores and 64GB of RAM, there's only so many ways to schedule and swap to maximize use of those resources.
---
Copy on Write (COW) sounds like Thin Provisioning. Thin Provisioning is known for 2 things: 1. Slower performance than "Thick Provisioning" where the entire allocated space is zeroed on allocation, instead of on write. And 2. Ease of overallocation - you can take a 100GB disk and create 10 virtual disks of 100GB each; this is like reserve banking, and it's only a problem if someone actually wants to use the entire resource that you say they can access.
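The overallocation arithmetic in point 2 can be made concrete. This is just the example above (a 100GB disk backing 10 virtual disks of 100GB each) in Python; the function names are made up for illustration.

```python
def overcommit_ratio(physical_gb, virtual_disks_gb):
    """Ratio of promised capacity to real capacity (1.0 means no overcommit)."""
    return sum(virtual_disks_gb) / physical_gb


def can_fit(physical_gb, used_gb_per_disk):
    """True while the space actually written still fits on the physical disk."""
    return sum(used_gb_per_disk) <= physical_gb


if __name__ == "__main__":
    print(overcommit_ratio(100, [100] * 10))   # 10.0
    print(can_fit(100, [5] * 10))              # True
    print(can_fit(100, [100] + [5] * 9))       # False
```

The last call is the "reserve banking" failure case: one tenant using its full 100GB leaves no room for anyone else's writes.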
I'm curious if they'll have an NTFS option. Actually, with Microsoft's recent open sourcing, I'd be interested to see NTFS open up a bit; maybe get an official Linux driver of some sort.
Will there be other write methods? Perhaps one that's more similar to Thick Provisioning?
---
I'd hate to see VMs die. The flexibility and value they provide to the Microsoft world is unparalleled. I could see Dockers replacing Linux etc VMs -- no need to run CentOS(?) to host your LAMP stack when you can just have each letter in its own cluster of containers.
Maybe if each Windows component was rerolled as a container image; we could have Domain Controller (DNS/AD/LDAP/Kerberos/ACL) containers, IIS containers, SQL containers, DFS containers that were backed by SAN or NAS, FTP containers, TFS containers, etc. And there would have to be RDP/VDI containers, where users could remotely connect, and work in the environment with a desktop and GUI tools, since that is such a core part of the Microsoft ecosystem.
---
Looking at the Security and Image Layers and Transportation sections makes me realize how young this technology is. A few more years and a few more iterations, and this could definitely replace numerous VM appliances and middleware devices. The time for Docker and containers isn't quite today, but it's very close.
Hmmm ... tell that to Airbnb (http://nerds.airbnb.com/future-app-deployment/), New Relic (https://blog.newrelic.com/2014/08/12/docker-centurion/) and Spotify (https://blog.docker.com/2014/06/dockercon-video-docket-at-sp...)
It is a young technology, but moving quickly. Just 10 years ago, YouTube wasn't owned by Google yet and we didn't have the first iPhone.
Hell, 17 years ago VMware (arguably the king in the virtual machine software market) was founded. If it were a kid, it wouldn't have graduated from high school yet!
For instance there's still work being done to add native PAM and by extension Kerberos support, and the daemon runs as root, thus requiring extra caution about who may run docker commands.
If you're (for example) in an enterprise where developers may never have root access under any circumstances, you end up with a chicken and egg scenario: if developers don't have the ability to test container creation (because doing so might grant them root access in a container), who does?
In summary from a person in that scenario:
1. Not well known, and too short a time horizon - People still run Windows XP in the real world. Changes where the rubber meets the road (IT and DevOps) take years of hard evidence, infrastructure cost, justifications, etc. to catch on. It does not behove these groups to be early adopters.
2. Not flexible enough yet - I would have a ton of use for this if I could run it more like a VM, but faster and easier to deploy. I devop with a product that uses its own kernel... I tried to talk Dev into compiling a kernel with Docker for a use case I have - you can guess where that went.
Docker is great, but I can only use it with my devs in its current state and for myself in specific cases.
Every service I deploy gets its own VM (which is automatically provisioned/locked-down by a bash script), and they automatically update when a new revision is pushed to our production git branch.
It seems that Docker is more useful when you have physical hardware and/or lots of under-utilized infrastructure?
https://forums.docker.com/t/docker-export-intermediate-size-...
No one seems to know anything about it.
Also, when we upgraded from 1.6.3 to 1.7, devicemapper started having issues.
On top of the bugs, the limited networking support is very, well, limiting.
I would be very hesitant about using it in production at the moment. That said, I can also see the potential, and it seems to be heading in the right direction. It's just not ready at this moment.
The article mentions that "most vendors still run containers in virtual machines", presumably since if someone hacks an app in a container they might be able to break out of the container and access other apps running on that host. But clustering systems like Kubernetes, CoreOS, AWS Container Service, etc. seem to be all the rage these days and they seem fundamentally at odds with this. The cluster might schedule multiple containers on the same host at which point somebody who hacks one can hack all of them.
How do you reconcile this? Do people running these clusters in production typically run tiers of separate clusters based on how sensitive the data they have access to is?
It becomes as simple as asking what name the cluster should have.
It also makes sense for managing resource concerns to some extent: for example, a cluster with cheap instances for low-priority applications that still need HA support, or a cluster with beefy instances in a subnet with fewer hops for edge-tier applications.
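A rough sketch of the "separate clusters per tier" idea, in plain Python rather than any real scheduler's API. The workload names, tier labels, and the `assign_clusters` helper are all hypothetical; the point is only that workloads of different sensitivity never land in the same cluster, so a container breakout exposes only same-tier neighbours.

```python
from collections import defaultdict


def assign_clusters(workloads):
    """Group workloads into one cluster per sensitivity tier.

    `workloads` maps a workload name to a tier label such as "low" or "high".
    The cluster name is derived from the tier, so picking a cluster really is
    just a matter of asking which name it should have.
    """
    clusters = defaultdict(list)
    for name, tier in sorted(workloads.items()):
        clusters[f"cluster-{tier}"].append(name)
    return dict(clusters)


# Hypothetical workloads, for illustration only.
workloads = {
    "marketing-site": "low",
    "image-resizer": "low",
    "billing-api": "high",
}
print(assign_clusters(workloads))
```

The same grouping key could just as well be a resource class (cheap instances vs. beefy ones) instead of a sensitivity tier.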
[0] http://blog.valbonne-consulting.com/2015/04/14/as-a-goat-im-...
There are lots of stragglers; sometimes it's a struggle to get people to even consider AWS for development boxes.
http://www.tomsitpro.com/articles/windows-server-2016-contai...
In the two hours I've spent with OSv, I've gotten much lighter weight VMs that boot my large Scala app extremely quickly (a few seconds, max), with less configuration and more predictable performance.
Simple tools (rpm + yum + docker) allowed us to replace these people with a simple shell script. Literally.
I agree with the article that Docker is missing some things. Two that I would like to see:
- Auto cleanup
- Clean and easy proxying
That means we're like a year away from it being boring and just working, right?
Docker's answer to storage so far has been "don't use Docker". That's their answer. Use volumes to map some other storage, but then you have to have some way of mapping storage to containers outside of Docker. Now you're really stuck.
Containers are awesome, but unless your product doesn't do work, you'll need to store data at some point. And that's when the magic stops.
It also does not link containers, instead opting to attach the database to the first IP address of the network Docker sets up, thereby avoiding the need for complicated service discovery. It includes instructions on how to deploy Redis on the same box and use it with WordPress, and on how to do SSL for each site. It's being used in production.
Containers are only going to grow in uptake; companies like Weave and ClusterHQ have a very bright future if they can solve real pain points like the ones in this article.
Hmm, I don't think so. My reason is that, in addition to the maturation and feature growth of containers, there will also be feature growth in Puppet et al.
IMO any tool that does procedural run-time configuration, like Chef/Ansible/Puppet, will generally be inferior to an image-based infra management solution (unless you're using said tools to build images, which is another ball of wax that will likely end up looking like a reimplemented Docker).
The problem with procedural run-time config is that unless you blow away the VM, build from scratch, and run a test suite you don't really have good assurances your infrastructure is in a good state. With images, you have a bit for bit copy of what was built and tested in CI or QA. This is, for us, worth the price of admission.
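The "bit for bit copy" guarantee boils down to comparing content digests. A minimal sketch, assuming the image is available as raw bytes; `digest` and `safe_to_deploy` are made-up helper names, not part of Docker or any registry API.

```python
import hashlib


def digest(image_bytes):
    """Content digest of an image artifact, in 'sha256:<hex>' form."""
    return "sha256:" + hashlib.sha256(image_bytes).hexdigest()


def safe_to_deploy(image_bytes, tested_digest):
    # Deploy only if the artifact is bit-for-bit what CI built and tested.
    return digest(image_bytes) == tested_digest


tested = b"app v1 + all dependencies"  # stand-in for the image CI produced
tested_digest = digest(tested)

print(safe_to_deploy(tested, tested_digest))         # True
print(safe_to_deploy(tested + b"!", tested_digest))  # False
```

A procedurally configured VM has no equivalent single value to check, which is why you'd have to rebuild from scratch and rerun the test suite to get the same assurance.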
Reproducibility implies being able to regenerate the full container including software version control and visibility of the full dependency chain all the way down to BLAS and glibc! You can't do that by using apt, rpm, Perl CPAN, rubygems, Python pip and the like. None of these package managers have been designed for true isolation of packages and full reproducibility. That is why today people go with Docker. The shortcomings of these package managers drive people to Docker.
The technology for regenerating exact Docker containers exists in the form of GNU Guix and/or Nix packages. The fun fact is that when using GNU Guix, Docker itself is no longer required.
Watch GNU Guix.
But that's as far as I will take it. Docker is mainly used (from what I've seen) as a nice way to package something that will work across multiple platforms (for the most part) without having to write an actual package (RPM/deb). If you take the time to learn how to properly package your application, Docker is unnecessary in almost every case.
I'll highlight that the website is off by one... Reading the website, I have no idea how it works or what technical debt I'm adding to my team's stack by using it. "Build/Run/Ship": I'm doing that already. I have no idea if it's using VMs or something else for containers, no idea if my hardware works on it, and no idea if the distros used for images are 1 year old or -nightly, so whose security issues am I inheriting?
Also, moving around a 700MB+ image when you can deploy some Debian package (or even set up a virtualenv; I do mostly Python) sounds like a waste of resources. Add to that that moving volumes around is still an issue and... well, Docker has a lot of potential, but it doesn't fit very well in any of the projects that I'm involved in.
I guess you can all call me a moron or close to it, but to this day I don't know what a container is good for.
I'm reading this now: https://www.docker.com/whatisdocker
Is this server level, user level, both? Something else? I saw a Hacker Con video and someone was basically containerizing all their apps so that the underlying OS basically did nothing but run the hardware and containers.
Does this mean that one day, instead of monitoring my installs on Mac OS X to see what files were put where (so that when something breaks I can find those files), I could simply install to a container, my OS would not be touched, and I could delete the container and be back to a stable state?
Can you even get OS X to run in a container? Would it be a good idea, or even feasible, to install Photoshop in a container, or is that not even possible, as that app tosses stuff all over my OS?
Or is this more like a different way to do things, but it is still AWS and their AMIs, or Digital Ocean, or any of the others?
I feel I am completely falling behind and have no idea why I would need one of these. Hell, if I made an iOS app, I have no idea if there are Apple servers you pay them to use for stuff, or if you deploy your own servers, or if you use something like AWS. And how does that scale on demand? Do you have to build that into your infrastructure, or is there an "auto scale as demand dictates" checkbox?
Are AWS, Docker, and all the rest, in the end, just like the old days, where I would have 42U of rack space, put in a DNS server, put in a few HTTP servers, use this as round-robin image load balancers, and add DB servers, backup DB servers, and replication DB servers? When I needed to grow for heavy demand for a day, that was an issue. I fail to see how you can scale based on demand when a database is involved.
Databases scare me, and one thing is never talked about... We have git for code, but what about databases? How does a dev team work that out if you need a new field, or to drop some data, alter a table, add an index, etc.? How do you get what you have on your local machine in test out to live? How is every little database change tracked and rolled back if need be? How do the DBA guys communicate with the coders to make sure a name change to a table gets updated in the code? Is there git for Postgres and others?
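One common approach to the "git for databases" question is versioned migration scripts that live in the same repository as the code. The following is only a toy sketch using Python's built-in `sqlite3`; the `MIGRATIONS` list, the `schema_version` table, and the table and index names are invented for illustration, and real migration tools do considerably more.

```python
import sqlite3

# Hypothetical migrations: (version, upgrade SQL, rollback SQL).
# In practice these would be files checked into git next to the code.
MIGRATIONS = [
    (1, "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)", "DROP TABLE users"),
    (2, "CREATE INDEX idx_users_name ON users (name)", "DROP INDEX idx_users_name"),
]


def current_version(db):
    # The database itself records which migrations have been applied.
    db.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)")
    row = db.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0


def migrate(db, target):
    """Move the schema up or down to `target`, one recorded step at a time."""
    version = current_version(db)
    for v, up, _ in MIGRATIONS:
        if version < v <= target:          # apply pending upgrades in order
            db.execute(up)
            db.execute("INSERT INTO schema_version VALUES (?)", (v,))
    for v, _, down in reversed(MIGRATIONS):
        if target < v <= version:          # undo newer steps in reverse order
            db.execute(down)
            db.execute("DELETE FROM schema_version WHERE version = ?", (v,))
    db.commit()


db = sqlite3.connect(":memory:")
migrate(db, 2)
print(current_version(db))  # 2
migrate(db, 1)              # roll back the index
print(current_version(db))  # 1
```

Because the same ordered list runs on a laptop, in test, and in production, "what I have locally" and "what is live" differ only by which version number each database has recorded.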
All this made sense to me 5 years ago, when the cloud was called the internet and email, FTP, HTTP, etc. were all part of the "cloud" or, as I call it, "the internet". But now things are confusing, wrapped up in terms like "cloud" that make no sense to me. I have been, as we all have been, using "the cloud" for over a decade. The first time I logged into a SLIP account and got on some gopher server or similar, that was cloud to me, and I believe it still is. POP email is cloud; the internet is cloud. It is now just convoluted for the developer so that end users understand something that really only developers need to understand. Am I making the same mistake with containers? Are they nothing special, something that has been around for ages like the cloud and is just a buzzword now? Apple has sandboxing and something called containers in their OS; is this similar in principle?
Auto makers don't burden their end users with engine shop talk, nor dumb it down for them into simple terms, but we seem to with tech. This should be its own post, but I just kinda got on a roll, sorry.