The flip side is that the world still hasn’t settled on a language-neutral build tool that works for all languages. Therefore we resort to running arbitrary commands to invoke language-specific package managers. In an alternate timeline where everyone uses Nix or Bazel or some such, docker build would be laughed out of the window.
> running arbitrary commands to invoke language-specific package managers.
This is exactly what we do in Nix. You see this everywhere in nixpkgs.
What sets Nix apart from Docker is not that it works well at a finer granularity, i.e. the source-file level, but that it has real hermeticity and thus reliable caching. That is, we also run arbitrary commands, but they don't get to talk to the internet and thus don't get to e.g. `apt update`.
In a Dockerfile, you can `apt update` all you want, and this makes the build layer cache a very leaky abstraction. This is merely an annoyance when working on an individual container build but would be a complete dealbreaker at linux-distro-scale, which is what Nix operates at.
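To make the leakiness concrete, here's a sketch of the usual mitigation, assuming a Debian-based image (the base image and packages are just examples): chain apt-get update and apt-get install in a single RUN so the index and the install are cached, and invalidated, together.

    FROM debian:bookworm-slim
    # One layer for index + install: a cache hit can never pair a stale
    # package index with a newer install command.
    RUN apt-get update \
        && apt-get install -y --no-install-recommends curl ca-certificates \
        && rm -rf /var/lib/apt/lists/*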
Therefore I would rephrase your remark as an upside: let some keep scratching their heads while others deploy working code to PROD.
I am glad there is a solution like Docker, with all its flaws. Nothing is flawless; there is always just yet another sub-optimal solution that outweighs the others by a large margin.
That's not going to work if the two parties get different hashes when they build the image, and they will, as long as file modification timestamps (and other such hazards) are part of what gets hashed.
It's not just the timestamps you need to worry about. Tar needs to be consistent with the uid vs username, gzip compression depends on implementations and settings, and the json encoding can vary by implementation.
And all this assumes the commands being run are reproducible themselves. One issue I encountered there was how alpine tracks their package install state from apk, which is a tar file that includes timestamps. There are also timestamps in logs. Not to mention installing packages needs to pin those package versions.
All of this is hard, and the Dockerfile didn't make it easy, but it is possible. With the right tools installed, reproducing my own images has a documented process [2].
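For the tar/gzip hazards mentioned above, this is roughly what the normalisation looks like (GNU tar assumed; the epoch date and file names are illustrative):

    # Fix ownership, entry order and mtimes so the archive bytes don't
    # depend on who built it or when; gzip -n omits its own timestamp.
    tar --sort=name --owner=0 --group=0 --numeric-owner \
        --mtime='UTC 2023-01-01' -cf rootfs.tar rootfs/
    gzip -n rootfs.tar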
Personally I love using mkosi, and while it has all the composability and deployment options I'd care for, it's clear not everyone wants to build starting only with a blank set of OS templates.
Or do you mean a replacement for docker hub?
Want to throw a requirements.txt in there? No no, why would you even ask that? Meanwhile docker says yeah sure just run pip install, why should I care?
In Spack [1] we do one layer per package; it's appealing, but, aside from the layer limit, I never checked whether it's actually bad for performance when doing filesystem operations.
It seems overly orthogonal for the typical use case but perhaps just not enough of an annoyance for anyone to change it.
I wish we had standardized on something other than shell commands, though. Puppet or terraform or something more declarative would have been such a better alternative to “everyone cargo cults ‘RUN apt-get upgrade’ onto the top of their dockerfiles”.
Like, the layer/stage/caching behavior is fine. I just wish the actual execution parts had been standardized using something at a higher level of abstraction than shell.
Until you need to do something that isn't covered by its DSL, and you extend it with an external command execution declaration... At which point people will just write bash scripts anyway and use your declarative language as a glorified exec.
However, Dockerfiles are so popular because they run shell commands and permit 'socially' extending someone else's shell commands; tacking commands onto the end of someone else's shell script is a natural process. /bin/sh is unreasonably effective at doing anything you need to a filesystem, and if the shell exposes a feature, it has probably been used in a Dockerfile somewhere.
Every other solution, especially declarative ones, tend to come up short when _layering_ images quickly and easily. However, I agree they're good if you control the entire declarative spec.
They sounded nice on paper, but somehow they were more annoying than the work they replaced.
I moved over to Docker when it came out because it used shell.
It's a BuildKit frontend, so you still use "docker build".
And if you want something weird that's not supported by your particular tool of choice, you have the escape hatch of running arbitrary commands in the Dockerfile.
What more do you want?
I'd get much better results if I used something else to do the foreach and gave terraform only static rules.
But as long as people want to use scripting languages (like PHP, Python, etc.), I guess Docker is the necessary evil.
I'll tell that to my CI runner. How easy is it for Go to download the Android SDK and to run Gradle? Can I also `go sonarqube` and `go run-my-pullrequest-verifications`? Or are you also going to tell me that I can replace that with a shitty set of GitHub Actions?
I'll also tell Microsoft they should update the C# definition to mark it down as a scripting language. And to give up on the whole language entirely: why would they do anything when they could just tell every developer to write if err != nil instead?
Just because you have an extremely narrow view of the field doesn't mean it's the only thing that matters.
Interesting. How does go build my python app?
Then I found an HN comment I wrote a few years ago that confirmed this:
“[...] I remember that day pretty clearly because in the same lightning talk session, Solomon Hykes introduced the Python community to docker, while still working on dotCloud. This is what I think might have been the earliest public and recorded tech talk on the subject:”
YouTube link: https://youtu.be/1vui-LupKJI?t=1579
Note: starts at t=1579, which is 26:19.
Just being pedantic though. That’s about 13 years ago. The lightning talk is fun as a bit of computing history.
(Edit: as I was digging through the paper, they do cite this YouTube presentation, or a copy of it anyway, in the footnotes. And they refer to a 2013 release. Perhaps there was a multi-year delay between the paper being submitted to ACM with this title and it being published. Again, just being pedantic!)
We first submitted the article to the CACM a while ago. The review process takes some time, and "Twelve years of Docker containers" didn't have quite the same vibe.

(The CACM reviewers helped improve our article quite a bit. The time spent there was worth it!)

Here’s the announcement from 2013:
Well, before Docker I used to work on Xen and that possible future of massive block devices assembled using Vagrant and Packer has thankfully been avoided...
One thing that's hard to capture in the article -- but that permeated the early Dockercons -- is the (positive) disruption Docker had in how IT shops were run. Before that going to production was a giant effort, and 'shipping your filesystem' quickly was such a change in how people approached their work. We had so many people come up to us grateful that they could suddenly build services more quickly and get them into the hands of users without having to seek permission slips signed in triplicate.
We're seeing another seismic cultural shift now with coding agents, but I think Docker had a similar impact back then, and it was a really fun community spirit. Less so today with the giant hyperscalers all dominating, sadly, but I'll keep my fond memories :-)
Funny comment considering lightweight/micro-VMs built with tools like Packer are what some in the industry are moving towards.
"Ship your machine to production" isn't so bad when you have a ten-line script to recreate the machine at the push of a button.
Wonder when some enterprising OSS dev will rebrand dynamic linking in the future...
I think it’s laziness, not difficulty. That’s not meant to be snide or glib: I think gaining expertise in how to package and deploy non-containerized applications isn’t difficult or unattainable for most engineers; rather, it’s tedious and specialized work to gain that expertise, and Docker allowed much of the field to skip doing it.
That’s not good or bad per se, but I do think it’s different from “pre-container deployment was hard”. Pre-container deployment was neglected and not widely recognized as a specialty that needed to be cultivated, so most shops sucked at it. That’s not the same as “hard”.
Minus the kernel of course. What is one to do for workloads requiring special kernel features or modules?
I sort of had the problem in mind. Docker is the answer. Not clever enough to have invented it.
If I did I would probably have invented octopus deploy as I was a Microsoft/.NET guy.
Genuinely fascinating and clever solution!
[1] https://github.com/rootless-containers/slirp4netns
[2] https://blog.podman.io/2024/03/podman-5-0-breaking-changes-i...
[3] https://passt.top/passt/about/#pasta-pack-a-subtle-tap-abstr...
SLIRP was useful when you had a dial-up shell and they wouldn't give you SLIP or PPP, or it would cost extra. SLIRP is just a userspace program that uses the socket APIs, so as long as you could run your own programs and make connections to arbitrary destinations, you could make a dial script to connect your computer up like you had a real PPP account. No incoming connections though (afaik), so you weren't really a peer on the internet, a foreshadowing of ubiquitous NAT/CGNAT perhaps.
That's a mistake indeed; "popularised by" might have been better. Before my beloved Palmpilot arrived one Christmas, I was only using SLIRP to ninja in Netscape and MUD sessions onto a dialup connection which wasn't a very mainstream use.
I don't recall whether you could technically open listening ports, at least for a single connection, using slirp, but many, if not all systems, limited opening ports under 1024 to superusers, which (would have?) made running servers on standard ports more difficult.
In any case, I'm glad that you pointed out ACM's apparent revisionist history. They should know better.
There was another component that we didn't have room to cover in the article that has been very stable (for filesystem sharing between the container and the host) that has been endlessly criticised for being slow, but has never corrupted anyone's data! It's interesting that many users preferred potential-dataloss-but-speed using asynchronous IO, but only on desktop environments. I think Docker did the right thing by erring on the side of safety by default.
(Half assed NOSQL 'databases' with poorly thought out storage models, everything having to be a microservice, turning every function call into a fallible RPC call etc...)
But I've come to appreciate it more, and I use it regularly now. I appreciate its relative simplicity.
But as in life, hell is other people's containers. My own I can at least try to keep simple and minimal.
But I have seen many use the kitchen-sink approach, giving me the feeling that even the developers don't seem to know how they arrived at their deployment anymore.
But this all seems quaint today. With LLMs, now we can look forward to a flood of code the developers haven't even looked at, but which is widely believed to work...
What I want to do when running a Docker container on Mac is to be able to have the container have an IP address separate from the Mac's IP address that applications on the Mac see. No port mapping: if the container has a web server on port 80 I want to access it at container_ip:80, not 127.0.0.1:2000 or something that gets mapped to container port 80.
On Linux I'd just use Docker bridged networking and I believe that would work, but on Mac that just bridges to the Linux VM running under the hypervisor rather than to the Mac.
Is there some officially recommended and supported way to do this?
For a while I did it by running WireGuard on the Linux VM to tunnel between that and the Mac, with forwarding enabled on the Linux VM [1]. That worked great for quite a while, but then stopped and I could not figure out why. Then it worked again. Then it stopped.
I then switched to this [2] which also uses WireGuard but in a much more automated fashion. It worked for quite a while, but also then had some problems with Docker updates sometimes breaking it.
It would be great if Docker on Mac came with something like this built in.
BTW are you trying to avoid port mapping because ports are dynamic and not known in advance? If so you could try running the container with --net=host and in Docker Desktop Settings navigate to Resources / Network and Enable Host Networking. This will automatically set up tunnels when applications listen on a port in the container.
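Roughly like this, assuming the Host Networking toggle described above is on (the image is just a placeholder):

    # With host networking enabled in Docker Desktop, a server listening on
    # port 80 in the container is reachable on the host without -p mappings.
    docker run --rm --net=host nginx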
Thanks for the links, I'll dig into those!
I want to avoid port mapping because I already have things on the Mac using the ports that my things in the container are using.
I have a test environment that can run in a VM, container, or an actual machine like an RPi. It has copies of most of our live systems, with customer data removed. It is designed so that, as much as possible, things inside it run with the exact same configuration they do live. The web sites in it are on ports 80 and 443, MySQL/MariaDB is on 3306, and so on. Similarly, when I'm working on something that needs to access those services from outside the test system, I want to use, as much as possible, the same configuration they will use when live, so they want to connect to those same port numbers.
Thus I need the test environment to have its own IP that the Mac can reach.
Or maybe not...I just remembered something from long ago. I wanted a simpler way to access things inside the firewall at work than using whatever crappy VPN we had, so I made a poor man's VPN with ssh. If I needed to access things on say port 80 and 3306 on host foo at work, I'd ssh to somewhere I could ssh to inside the firewall at work, setting that up to forward say local 10080 and 13306 to foo:80 and foo:3306. I'd add an /etc/hosts entry at foo giving it some unused address like 10.10.10.1. Then I'd use ipfw to set it up so that any attempt to connect to 10.10.10.1:80 or 10.10.10.1:3306 would get forwarded to 127.0.0.1:10080 or 127.0.0.1:13306, respectively. That worked great until Apple replaced ipfw with something else. By then we had a decent VPN for work and so I no longer need my poor man's VPN and didn't look into how to do this in whatever replaced ipfw.
Learning how to do that in whatever Apple now uses might be a nice approach. I'll have to look into that.
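From a quick look, ipfw's replacement is pf, so a rough sketch of the same trick might be the following (addresses, ports, and the pfctl incantation are from memory and untested, and loading rules this way replaces the active ruleset, so treat it as a starting point rather than a recipe):

    # Alias the fake host onto loopback, then redirect with pf to the
    # local ports that ssh -L is forwarding to foo:80 and foo:3306.
    sudo ifconfig lo0 alias 10.10.10.1/32
    printf '%s\n' \
      'rdr pass on lo0 inet proto tcp from any to 10.10.10.1 port 80 -> 127.0.0.1 port 10080' \
      'rdr pass on lo0 inet proto tcp from any to 10.10.10.1 port 3306 -> 127.0.0.1 port 13306' \
      | sudo pfctl -ef -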
As another commenter mentioned, Colima is a good alternative to Docker Desktop if you're looking. It doesn't expose container IPs either, but docker-mac-net-connect does support Colima ootb now.
"Docker, Guix and NixOS (stable) all had their first releases
during 2013, making that a bumper year for packaging aficionados."
Now we get coding agent updates every week, but has there been a similar year since 2013 where multiple great projects all came out at the same time?

If you want the gains mentioned, you have to invest in governance: immutable tags, automated image scanning with Trivy, signing with cosign, and sensible image retention policies in your registry. Accept the tradeoff that you will be operating a distributed control plane and therefore need real observability like Prometheus plus request and limit discipline, or you'll get the utilization benefits in graphs only while production quietly melts down.
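A minimal sketch of what that looks like in a pipeline (the registry path and key file are placeholders):

    # Fail the build on serious findings, then sign the pushed image.
    trivy image --exit-code 1 --severity HIGH,CRITICAL registry.example.com/app:1.4.2
    cosign sign --key cosign.key registry.example.com/app:1.4.2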
When compared to a VM, yes. But shipping a separate userspace for each small app is still bloat. You can reuse software packages and runtime environments across apps. From an I/O, storage, and memory utilization point of view, it feels baffling to me that containers are so popular.
I've recently switched from docker compose to process compose and it's super nice not to have to map ports or mount volumes. What I actually needed from docker had to do less with containers and more with images, and nix solves that problem better without getting in the way at runtime.
Why do you think other tools will make a comeback?
Have others found this to be the case? Perhaps we're doing something wrong.
I have some hobby sites I host on a VM and currently I use docker-compose mainly because it's so "easy" to just ssh into the machine and run something like "git pull && docker-compose up" and I can have whatever services + reverse proxy running.
If I were to sum up the requirements, it would be to run one command, and it either succeeds or fails in its entirety, with minimal to no risk of messing up the env during deployment.
Nix seems interesting but I don't know how it compares (yet to take a good look at it).
I will say that consuming other people's services that I don't intend to develop on is easier with containers. I use podman for my jellyfin and Minecraft servers based on someone else's configs. My only issue with them is the complexity during development.
(article author here)
Apple containers are basically the same as how Docker for Mac works; I wrote about it here: https://anil.recoil.org/notes/apple-containerisation
Unfortunately Apple managed to omit the feature we all want that only they can implement: namespaces for native macOS!
Instead we got yet another embedded-Linux-VM which (imo) didn't really add much to the container ecosystem except a bunch of nice Swift libraries (such as the ext2 parsing library, which is very handy).
But the main benefit is the attack surface is greatly reduced when running a unikernel. Also we use way less resources and get really good perf.
For instance, deploying a complex Python application was hell, for lack of proper packaging. Using Vagrant was easy, but the image was huge (full system) and the software slow (full virtualization), among other problems. Containers like LXC and Docker were a bit easier to setup, much smaller, almost as performant as native packaging, and with a larger spectrum of features for sharing things with the host (e.g. overlay mounts).
No, it does not make it "very difficult".
And like other comments here grumble - this rationale is essentially a sanctification of the sentiment of "It builds and runs on my system and I can't be bothered to make fewer assumptions so that it runs on yours".
Still, I use it every day and I don't see what replaces it. Every "docker killer" solves one problem while ignoring the 50 things Docker does well enough.
Buying more RAM for your server or only touching a select few images that are run most often is also a way to make things work. It might not be the most elegant software engineering approach, but it just works.
These are pretty handy to use
I want it not just to be invisible but to be missing. If you have Kubernetes, including locally with k3s or similar, it won't be used to run containers anyway. However, it is still often used to build OCI images. Podman can fill that gap: it takes a Containerfile with the same syntax, but it's simpler than Docker's build tooling, which now bundles build-orchestration features similar to earthly.dev that I think are better kept separate.
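For the image-building gap, the invocation is the familiar one, just daemonless (names are placeholders):

    # Same build-instruction syntax, no Docker daemon needed.
    podman build -f Containerfile -t registry.example.com/myapp:dev .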
Linux user space decided to try and share dependencies. Docker obliterates this design goal by shipping dependencies, but stuffing them into the filesystem as-if they were shared.
If you’re going to do this then a far far far simpler solution is to just link statically or ship dependencies adjacent to the binary. (Aka what windows does). Replicating a faux “shared” filesystem is a gross hack.
This is a distinctly Linux problem. Windows software doesn’t typically have this issue. Because programs ship their dependencies and then work.
Docker is one way to ship dependencies. So it’s not the worst solution in the world. But I swear it’s a bad solution. My blood boils with righteous fury anytime anyone on my team mentions they have a 15 minute docker build step. And don’t you damn dare say the fix to Docker being slow is to add more layers of complexity with hierarchical Docker images ohmygodiswear. Running a computer program does not have to be hard I promise!!
The more recent half of my career has been more focused on ML and now robotics. Python ML is an absolute clusterfuck. It is close to getting resolved with uv and Pixi. The trick there is to include your damn dependencies… via symlink to a shared cache.
Any program or pipeline that relies on whatever arbitrary ass version of Python is installed on the system can die in a fire.
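A rough sketch of what that looks like with uv (the version and file names are illustrative; the point is that the interpreter and packages come from uv's own cache, not the system):

    # Pin an interpreter and resolve dependencies from uv's shared cache;
    # nothing here touches whatever python the OS happens to ship.
    uv python install 3.12
    uv venv --python 3.12
    uv pip install -r requirements.txt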
That’s mostly about deploying. We can also talk about build systems.
The one true build system path is a monorepo that contains your damn dependencies. Anything else is wrong and evil.
I’m also spicy and think that if your build system can’t cross-compile then it sucks. It’s trivial to cross-compile for Windows from Linux because Windows doesn’t suck (in this regard). It’s almost impossible to cross-compile to Linux from Windows because Linux userspace is a bad, broken, failed design. However, Andrew Kelley is a patron saint and Zig makes it feasible.
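For instance, something along these lines: zig cc acts as a drop-in cross compiler, and a musl target avoids tying the binary to a particular glibc (the target triple is just one choice).

    # Cross-compile a C program for Linux from any host Zig supports.
    zig cc -target x86_64-linux-musl -o app main.c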
Use a monorepo, pretend the system environment doesn’t exist, link statically/ship adjacent so/dll.
Docker clearly addresses a real problem (that Linux userspace has failed). But Docker is a bad hack. The concept of trying to share libraries at the system level has objectively failed. The correct thing to do is to not do that, and don’t fake a system to do it.
Windows may suck for a lot of reasons. But boy howdy is it a whole lot more reliable than Linux at running computer programs.
Personally, I think docker is dumb, so is AppImage, so is FlatPak, so are VMs… honestly, it’s all dumb. We all like these various things because they solve problems, but they don’t actually solve anything. They work around issues instead. We end up with abstractions and orchestrations of docker, handling docker containers running inside of VMs, on top of hardware we cannot know, see, control, or inspect. The containers are now just a way to offer shared hosting at a price premium with complex and expensive software deployment methods. We are charged extortionate prices at every step, and we accept it because it’s convenient, because these methods make certain problems go away, and because if we want money, investors expect to see “industry standards.”
There’s another one, at least IMHO, that this entire stack from the bottom up is designed wrong and every day we as a society continue marching down this path we’re just accumulating more technical debt. Pretty much every time you find the solution to be, “ok so we’ll wrap the whole thing and then…” something is deeply wrong and you’re borrowing from the future a debt that must come due. Energy is not free. We tend to treat compute like it is.
Maybe I’m in a big club but I have a vision for a radically different architecture that fixes all of this and I wish that got 1/2 the attention these bandaids did. Plan 9 is an example of the theme if not the particular set of solutions I’m referring to.
Is there any insight into this? I would have thought the opposite: that developers on the platform that made Docker succeed would be given the first preview of features.
> It is also something developers seem to enjoy using
Count me out.
Docker made it convenient to distribute some functionality that was planned as a stack of services, rather than packaging it appropriately for each distribution and handing the configuration over to an administrator. That’s it. And it has many inconveniences. And Linux, as well as BSD, had containers and chroots and many other things before.