I'm basing this on the documentation I've read; I haven't tried it myself.
Some people have considered another approach: flattening an existing stack of images. Scripts for that are linked from the Docker issue on GitHub. I wasn't able to get any of these working, and the logic behind these seemed quite convoluted.
Still, my script is just a proof of concept: I wanted to see whether the approach I use internally for Docker build scripts could also be used to build from Dockerfiles. It seems it can, and it delivers good results. Time and actual usage will show whether it's a good idea; if this approach makes sense, it will hopefully make its way into the Docker core, and my hack won't stay relevant for long.
EDIT: from the [github issue](https://github.com/dotcloud/docker/issues/332#issuecomment-2...):
"Currently the only way to "squash" the image is to create a container from it, export that container into a raw tarball, and re-import that as an image. Unfortunately that will cause all image metadata to be lost, including its history but also ports, env, default command, maintainer info etc. So it's really not great."
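For reference, the create/export/import round trip described in that quote could be scripted roughly like this. It's only a sketch: it assumes a Docker CLI that has the `create`, `export --output`, `import` and `rm` subcommands, and the container name, tarball name and image tags are made up for illustration. It defaults to a dry run that only prints the commands, and, as the quote says, the resulting image loses its metadata.

```python
import subprocess


def squash_commands(image, squashed_tag,
                    container_name="squash-tmp", tarball="squashed.tar"):
    """Return the docker CLI invocations for an export/import squash.

    container_name and tarball are hypothetical defaults for this sketch.
    """
    return [
        # Create (but do not start) a container from the layered image.
        ["docker", "create", "--name", container_name, image],
        # Dump the container's merged filesystem into a single tarball.
        ["docker", "export", "--output", tarball, container_name],
        # Re-import the tarball as a new image with no layer history.
        ["docker", "import", tarball, squashed_tag],
        # Remove the temporary container.
        ["docker", "rm", container_name],
    ]


def squash(image, squashed_tag, dry_run=True):
    """Print each command, and execute it unless dry_run is set."""
    for argv in squash_commands(image, squashed_tag):
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


# Hypothetical image names; dry run only, nothing is executed.
squash("example/layered:latest", "example/flat:latest")
```

Ports, env, default command and so on would have to be re-applied by hand afterwards, which is exactly the complaint in the issue.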
Which makes sense to me, because you have no idea if an arbitrary shell command is deterministic or not.
So a good way to optimize your Dockerfile is to put commands in an order like:
* dependencies, e.g. apt-get, useradd...
* container config (PORT, ENV, USER), ordered from least likely to change to most likely to change
* ADD commands
* final RUN commands to set up your image
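To see why that ordering helps, here is a toy model of the build cache, not Docker's actual implementation. It assumes a step is reused only if it and every step before it are unchanged, and it stands in for "the added files changed" by changing the text of the ADD line. All instruction strings are made up.

```python
def cached_layers(previous_build, new_build):
    """Count how many leading instructions can be served from the cache.

    Toy model: an instruction is reused only if it and every instruction
    before it are identical to the previous build.
    """
    reused = 0
    for old, new in zip(previous_build, new_build):
        if old != new:
            break
        reused += 1
    return reused


# Hypothetical instructions; only the application source changes between builds.
deps = "RUN apt-get install -y python"
config = "ENV APP_ENV production"
add_v1 = "ADD app-v1.tar /app"
add_v2 = "ADD app-v2.tar /app"
setup = "RUN /app/setup.sh"

# Recommended order: dependencies, config, ADD, final RUN.
good_v1 = ["FROM ubuntu", deps, config, add_v1, setup]
good_v2 = ["FROM ubuntu", deps, config, add_v2, setup]

# Same steps, but the frequently changing ADD comes first.
bad_v1 = ["FROM ubuntu", add_v1, deps, config, setup]
bad_v2 = ["FROM ubuntu", add_v2, deps, config, setup]

print(cached_layers(good_v1, good_v2))  # 3
print(cached_layers(bad_v1, bad_v2))    # 1
```

With the recommended order the slow dependency step stays cached; with ADD up front everything after FROM gets rebuilt.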
Context: the Docker people have already announced that they intend to remove their dependency on aufs.
Alternatives/Reality-check: LVM2 can provide snapshots at the block layer: either through the normal approach with a single depth limit (though you can un-snapshot a snapshot through a process known as a merge, and then snapshot again as required), or through the new/experimental thin provisioning driver to get arbitrary depth (but 16GB max volume size). In both cases it's filesystem neutral, and the first approach is very widely deployed, which means no roll-thy-own-kernel requirement. ZFS and btrfs also provide snapshots, but historically ZFS has been poorly supported or slow on Linux (userspace driver, or build your own kernel) and btrfs has been unfinished and still in development. Linux also supports the snapshot-capable filesystems fossil (from plan9), gpfs (from IBM), and nilfs (from NTT). A related set of options is cluster filesystems with built-in replication, see https://en.wikipedia.org/wiki/Clustered_file_system#Distribu... Overall, the architectural perspective on various storage design options can be hard to grasp without digging, and higher-layer solutions such as NoSQL distributed datastore applications remain strong options in many cases.
Trend/future?: Containers in general are moving towards formalizing the "here's what I need: x-depth snapshots with y-availability and z-redundancy" environment requirements specifications for software. In the nearish future I predict that we'll see this in terms of all types of resources (network access at layers 2 and 3, CPU, memory, disk IO, disk space, etc.) for complex, multi-component software systems as CI/CD processes mature and container-friendly software packaging becomes normalized (we're already much of the way there for single hosts - eg. with Linux cgroups). Infrastructure will become 'smarter', and the historical disconnect between network gear and computing hosts will begin to break down. Systems and network administration will tend to merge, and the skillsets will become rarer as a result of automation.
* you have to preallocate the size of the snapshot backing storage.
* you create N snapshots of the same base block device. For each block changed in the base, each of the snapshots will get a copy-on-write block added to its backing storage.
* you cannot resize a snapshot (I mean the logical volume size, not the storage area for COW data)
* you cannot shrink the snapshot backing storage
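The cost of the N-snapshots point can be put in numbers with a back-of-the-envelope model. This is a toy sketch of classic (non-thin) snapshots, not anything LVM exposes: it assumes the first write to a chunk of the base copies the old data into every snapshot's backing store, and later writes to the same chunk cost nothing extra.

```python
def cow_copies(num_snapshots, written_chunks):
    """Count chunk copies triggered by writes to the base volume.

    Toy model: each distinct chunk written in the base is copied once
    into the backing store of every snapshot.
    """
    distinct_chunks = set(written_chunks)
    return num_snapshots * len(distinct_chunks)


# Hypothetical workload: chunk 2 is written twice, so 3 distinct chunks.
print(cow_copies(1, [1, 2, 2, 5]))   # 3
print(cow_copies(10, [1, 2, 2, 5]))  # 30
```

So ten containers snapshotted from one base volume means every change to the base is paid for ten times over.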
Snapshot-aware filesystems solve these issues. The slowness of ZFS you mention is only true of the FUSE-based toy driver. The license incompatibility between ZFS and the Linux kernel is a source of much confusion. All it means is that you cannot distribute Linux kernel binaries linked with ZFS code (where a kernel module can be seen as parts of the Linux kernel API linked with ZFS code). However, nothing prevents you from compiling the module on your own machine, and there is a nicely packaged solution for doing this, with support for distributions:
There is also a new place for promoting ZFS: http://open-zfs.org
AuFS seems to me a rather pragmatic approach for those who don't need the advanced features and performance of an advanced filesystem, yet don't want to waste IO bandwidth just to provision a lightweight container.
* performance overhead of each layer, however small
* disk space for files removed in intermediate steps (scenario: ADD huge-ass source tarball, commit, RUN compile+install+remove, commit - the user still has to download the huge-ass source tarball to use the final image, which doesn't contain it)
* there's often just no need to publish intermediate layers; there may even be a good reason to not publish them (say, I distribute a program compiled with a proprietary compiler as a step of the build, but can't distribute the compiler itself)
* simplicity of having just one image for the user to download and for the publisher to distribute, rather than a whole chain (this will be more important when we are able to use something other than the registry to distribute images)
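The disk space point from the list above is easy to put in numbers. The following is a toy model of layers as "files added" plus "files removed", not Docker's real on-disk format, and the paths and sizes are invented.

```python
def layered_download_size(layers):
    """Bytes a user downloads when every layer in the chain is shipped."""
    return sum(size for layer in layers for size in layer["added"].values())


def flatten(layers):
    """Apply the layers in order and return the final filesystem's files."""
    files = {}
    for layer in layers:
        files.update(layer["added"])
        for path in layer["removed"]:
            files.pop(path, None)
    return files


# Hypothetical two-commit build: ADD the source, then compile and clean up.
layers = [
    {"added": {"/src.tar.gz": 500_000_000}, "removed": []},
    {"added": {"/usr/bin/app": 20_000_000}, "removed": ["/src.tar.gz"]},
]

flat = flatten(layers)
print(layered_download_size(layers))  # 520000000
print(sum(flat.values()))             # 20000000
```

The tarball is gone from the final filesystem, but in the layered chain it still has to be downloaded.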
I guess a lot depends on other aspects of your project. For example, if you are looking at distributing frequently, and rsync is an option, then bandwidth concerns are effectively nullified. Likewise, disk space diffs for a few installs on a base filesystem are not big and thus not really expensive to keep. But I agree with you.
One aspect is crypto: signing a tarball is easier than a bunch of files.
The kind of people who use Perl like this are generally much more advanced Perl users than the developers of the clumsy CGI scripts that might have formed your first impressions of Perl.
Dancer, Moose, DBIx::Class (a bit more advanced), Plack/PSGI, etc etc. Not to mention the language changes they've made with 5.10/5.12/5.14/etc (I'm in love with the defined-or operator).
- Not exactly true: you can launch a shell in that one command, in interactive mode, and then run as many commands in that shell as you'd like.
Actually, I've just disabled the VOLUME statement in the script, as it seems to be a no-op in Docker. The only trace it leaves in the image is setting the image's command to '/bin/sh -c "#(nop) VOLUME [\"/data\"]"'.
Do people usually roll out their own images from source/based on verified binaries from the parent distribution's repositories or are base images provided by the community?
The place of trust here is the registry: usually, for convenience, tags are used rather than hashes (and I'm still not quite sure whether the long hex IDs are hashes or just unique random names). The registry returns the hex ID for a given tag, and is trusted to deliver the correct files for an ID.
I believe that the main index/registry runs over https and provides basic security, but it would be a huge issue if it were compromised. It's quite easy to run your own registry, too. What I'd love to see on top of that is some kind of GPG-based verification of downloaded images (Debian has basically solved this problem in Apt).
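Short of real GPG signatures, a weaker stand-in would be checking an exported image tarball against a digest the publisher announces out of band. The sketch below does only that; it is not something the registry provides, and the function names and the idea of a published SHA-256 are assumptions for illustration.

```python
import hashlib
import hmac
import os
import tempfile


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large tarballs need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_tarball(path, expected_hex):
    """Return True if the file's SHA-256 matches the expected hex digest."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.lower())


# Demo on a throwaway file standing in for a downloaded image tarball.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"pretend this is an image tarball")

published = sha256_of_file(tmp.name)  # in reality, obtained from the publisher
print(verify_tarball(tmp.name, published))  # True
print(verify_tarball(tmp.name, "0" * 64))   # False
os.remove(tmp.name)
```

A bare digest only helps if it comes through a channel you trust more than the registry; a signature over the digest is what would actually close the gap.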
That allows me to work around a 'blocker' [1] for me right now: I can write a normal Dockerfile and use this ready-made tool to test/build the image until my 'docker build' issue is resolved one way or another. Cool!