"macOS native containers"
Cool, this sounds interesting.
"Disable System Integrity Protection."
Eesh.
(More details here: https://support.apple.com/en-us/102149 )
WRT the security implications of disabling SIP - I don't think the OS becomes any less secure than your usual Linux/Windows installation.
Not every workload is running on an endpoint connected to a human via keyboard and screen.
Isn't this especially dangerous on a build worker? All your source code goes in and you (presumably) use the binaries that come out across the rest of your infrastructure. Compromising a build worker in a persistent fashion due to lack of SIP seems like it could do some serious[1] harm...
There's probably a hundred things that are not right just yet, and they know it - let's not be overly negative.
(although the discussion on what it is and what it does definitely is interesting)
This is the first thing I do on any macOS system before I start using it.
Wouldn't a Linux device, or Linux running on a Mac suit you better?
For me, the security picture is one of the main features of the ecosystem, even if it's very restrictive - disabling SIP undermines it more or less completely.
Edit: to be clear for the people who may not know, Apple Pay does not work with SIP disabled. ;P
Aaand, it's stillborn. Not happening.
Fundamentally, containers are about namespace/isolation of a bunch of OS interfaces, so file system functions, network functions, memory management, process functions, etc, can all pretend like they're the only game in town, but crucially without having to virtualize out the kernel.
Does XNU have such namespacing functionality across all its interfaces?
Furthermore, the existing container ecosystem assumes a Linux syscall interface [1]. Does macOS provide that? I expect not.
The way Docker Desktop (and podman.io) implement "containers on macOS" is a bit of a cop-out: they actually run a Linux virtual machine (using Hypervisor.framework/hvf), and have that just provide the container environment.
Is that what this project is doing? But then, how could it run a macOS container?
[1] based on the foundation that Linux, unlike BSDs, has a stable syscall interface!
This is fine if you have a 32/64GB machine, but less so on an 8GB non-upgradeable laptop.
I get it - memory is relatively cheap these days - and manufacturers that are building memory-limited devices are really only doing it to fleece you on obscene upgrade fees at the time of purchase - but it would be nice if there was a more elegant solution to this on Windows and macOS.
WSL 1 had a solution to this that clearly took a lot of work to put together, wherein the Linux syscall interface was implemented directly as a Windows subsystem so that the memory pool was shared. Unfortunately it might have been too much work, as they scrapped it entirely for WSL 2 and just took essentially the same VM route.
If anyone knows of any projects trying to work around that problem, I'd love to hear about it. If Apple really wanted to bring the development community back on board, focusing on these kinds of use cases would be great; sadly it seems someone over there has taken the view that scrapping the butterfly keys and the Touch Bar is "enough".
Say what you will about Microsoft, but they've focused really hard on developer use cases for decades, and it shows.
Containers are namespaced processes. These processes exec against the corresponding kernel they require. There is no workaround: if you have an ELF binary calling Linux syscalls it can only run on a Linux kernel†, so to run that you need a VM††. It's not as bad as it appears thanks to memory ballooning†††.
Conversely if you want to exec a Windows binary in a container, the Windows kernel needs to provide process namespacing features (which it does). And if you want to exec a Darwin binary in a container, then the Darwin kernel needs to provide process namespacing features (which it doesn't).
† WSL1 implemented the Linux syscall API on top of the Windows kernel, which proved to be much more complex than it appears.
†† Or colinux (https://en.wikipedia.org/wiki/Cooperative_Linux), or user-mode Linux (https://en.wikipedia.org/wiki/User-mode_Linux).
autoMemoryReclaim – Makes the WSL VM shrink in memory as you use it by reclaiming cached memory
https://devblogs.microsoft.com/commandline/windows-subsystem...
Obviously that limits the options, but I'll still be taking one last shot at using creative workarounds to tackle the memory problem in OrbStack (another containers-on-macOS product).
I imagine over time it will get smarter too. Right now it waits for no containers to be running for 30 seconds and then enables resource-saving mode, but who knows what could happen in the future. Maybe it could internally profile and estimate load by evaluating runtime stats of your containers, dynamically change the VM's resources on the fly, and then expose a +% over-provision threshold option, or a way to turn off dynamic resource saver mode.
Edit: It has been a while since I last looked at this. Looks like containerd is, perhaps, a native option
When Windows containers are being used, the VM is just there to keep the Docker daemon happy.
rund is an experimental containerd shim for running macOS containers on macOS.
rund doesn’t offer the usual level of container isolation that is achievable on other OSes due to limited macOS kernel API.
What rund provides:
- Filesystem isolation via chroot(2)
- Cleanup of container processes using process group
- OCI Runtime Specification compatibility (to the extent it is possible on macOS)
- Host-network mode only
- bind mounts

Except for bind mounts (not even overlayfs...) there isn't much of interest here.
> - Host-network mode only
Yeah, expect a lot of things to break in subtle ways... most containers are developed kind of expecting that you have your own network namespace (and that no one else is using your ports).
Essentially, like this:
https://ericchiang.github.io/post/containers-from-scratch/
https://earthly.dev/blog/chroot/
> The way Docker Desktop (and podman.io) implement "containers on macOS" is a bit of a cop-out
It's not, it's a requirement for running Linux containers: https://news.ycombinator.com/item?id=37656401
DfM (Docker for Mac) is more like running the CLI locally against a remote Linux machine; all it does is conveniently expose /Users in the same place through the VM's folder share, so you have the convenient illusion that it's all happening locally.
If Darwin had process namespacing features it would not make it magically able to run Linux processes.
I don't think so, but some Docker features could be implemented using XNU sandboxing AFAIK
> Furthermore, the existing container ecosystem assumes a Linux syscall interface. [1]. Does macOS provide that? I expect not.
This project is about running macOS containers on macOS. It's not about running Linux containers.
> I don't think so, but some Docker features could be implemented using XNU sandboxing AFAIK
Theoretically, probably, for coarse-grained yes/no things? I don't think it can go much further than "you can use the local network and/or internet" and "you can read/write the filesystem location corresponding to your bundle identifier `com.foo.bar`", but not "hey, let me present you with a namespaced view of loopback or the process list".
Also, I'm not sure whether it can be dynamically set by a parent process for a child? It seems very bundle-oriented (except maybe for Apple processes), so not very practical.
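For a flavor of how coarse this is: macOS still ships the (long-deprecated) `sandbox-exec` tool, which takes Scheme-like SBPL profiles. A hypothetical minimal profile - the exact rule set here is an illustrative assumption, not a tested profile - where every rule is a flat allow/deny:

```scheme
; sketch of an SBPL profile (run via: sandbox-exec -f profile.sb /usr/bin/true)
(version 1)
(deny default)                                  ; start from nothing
(allow process-exec (literal "/usr/bin/true"))  ; a specific binary, yes/no
(allow file-read* (subpath "/usr/lib"))         ; a whole subtree, yes/no
(allow network-outbound)                        ; "may use the network", full stop
```

Nothing in this language can express "give this process its own loopback or its own PID view" - it only gates access to shared host resources.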
There is more to the container ecosystem than Linux containers; Windows native containers function much the same way (well, in two ways, with VM-backing or the traditional kernel syscall interface, but with Windows syscalls).
1. Rely on system call stability. This is like Linux containers, but unlike Linux, macOS doesn't provide a stable system call API, so this would break whenever a system update changes the system call interface.
2. Install the host libraries into the container at runtime. This should provide as much stability as macOS apps usually have. It may also be beneficial as you wouldn't be embedding these into every container.
It seems like 2 would be preferable. However, it may be a bit weird when building, as the libraries you build against would be updated without the container being aware; but this is unlikely to break anything unless they are moved to new paths, which seems unlikely.
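Strategy 2 could be sketched with bind-style mounts. macOS has no native bind mounts, so this hypothetical sketch uses the third-party FUSE tool bindfs; the container root path and tool choice are assumptions - only /System/Library and /usr/lib are the real host locations:

```shell
# Hypothetical: surface the host's system libraries inside a chroot-style
# container root, read-only, so binaries link against whatever the host runs.
CONTAINER_ROOT=/var/lib/containers/demo
mkdir -p "$CONTAINER_ROOT/System/Library" "$CONTAINER_ROOT/usr/lib"
bindfs -r /System/Library "$CONTAINER_ROOT/System/Library"
bindfs -r /usr/lib        "$CONTAINER_ROOT/usr/lib"
```

A host update then changes the container's view in place, which is exactly the "libraries change underneath you" caveat.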
I'm really wondering, do you have any links about macOS syscall stability over versions?
I get the point of isolation for build/test situations. But Apple provides a neat virtualization framework, and you get security + isolation + reproducibility + decent performance.
It seems like if you feel the need to containerize the userspace on macOS, you're using macOS wrong. It's not the same thing as the Linux userspace, and it doesn't have the same kernel features that would let you do so cleanly or performantly.
OrbStack is moving mountains to provide Linux-native perf and support for containers, and it still raises the question: why are devs allergic to just using Linux natively? At least I understand why OrbStack is useful; I don't know why containerizing macOS itself is.
You also get limits on how many VMs your machine can run, each VM needs gobs of storage and locked-out blocks of RAM, and sharing directories between host and guest, compared to bind mounts, is something that makes me remember my root canal wistfully.
I can see how you'd need a crap ton of disk for macOS virtualization, but again, why do you need it?
If it's isolation for builds, fix your build. If it's isolation for tests, live with it. If it's for running your app, write your app to properly run in the app sandbox.
Or is this just the fully open source Darwin core? That wouldn't likely be super compatible with a ton of production software? I need more explanation of what is actually going on here because it sounds like a good way to get sued.
1. This project didn't get explicit permission from Apple to redistribute binaries
2. There are multiple jurisdictions where you don't need to explicitly have such permission, it is implied by law
3. Usage of this software implies you already have macOS system. I'm not a lawyer, but it looks to be covered by section 3 of macOS EULA.
4. There are existing precedents of redistribution of macOS binaries going back multiple years already:
- https://github.com/cirruslabs/macos-image-templates/pkgs/con...
- https://hub.docker.com/r/sickcodes/docker-osx
- https://app.vagrantup.com/jhcook/boxes/macos-sierra
And so on.
Unless you're producing fully static binaries (or ones static enough that they don't link against non-redistributable things), it'd be a yes (it wouldn't be much of a container if it needed non-packaged things).
The screenshot points out a ghcr.io URL that lands on these packages: https://github.com/orgs/macOScontainers/packages?repo_name=m...
Edit: There's a note here†, so at least there is some consideration for licensing. No idea whether it holds up.
† https://github.com/macOScontainers/macos-jail/blob/9b1d5b141...
Note: I work at Earthly, but I'm not wrong about this being a good, free, arm64-native workflow for GitHub Actions.
https://github.com/macOScontainers/rund - new code
https://github.com/macOScontainers/moby - fork, 6 commits
https://github.com/macOScontainers/buildkit - fork, 4 commits
https://github.com/macOScontainers/containerd - fork, 5 commits
Would be interesting to see if they can get moby/buildkit/containerd changes upstreamed
The other part of the containerd changes is waiting on gods-know-what: https://github.com/containerd/containerd/pull/9054
But I haven't given up yet.
Sorry that we had to revert #8789, but we are looking forward to seeing that PR submitted again with an alternative abstraction interface.
It's intended to prevent malware from changing system files due to rogue permissions or escalation. With SIP enabled, even the root/sudo user doesn't have rights to change these files.
It also refuses to boot a system with drivers that are not signed by Apple, so as to deter malware from using drivers as an attack vector.
Not really. «Secure Boot» is intended to secure the boot process through signature verification. However, that security model is completely broken: https://arstechnica.com/information-technology/2023/03/unkil...
SIP is a protection layer which protects system files from modification also after the system is booted.
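For reference, the SIP state is inspectable from any shell, though actually toggling it requires booting into Recovery first (the commands are real; the exact output line may vary by macOS version):

```shell
# Read-only status check, works from a normal session:
csrutil status
# -> System Integrity Protection status: enabled.

# From Recovery (Cmd-R at boot), to toggle:
csrutil disable    # or: csrutil enable
```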
It clearly links to the GitHub where you can click to see all contributors
I suppose the answer to your question is “people who want macOS containers”, whoever they are. As far as malware, I’d employ whatever your standard practices are for installing GitHub projects
I bought an Apple Silicon machine after their presentation claiming they would have first-class Docker support, but the reality has been that while Docker initially worked well under translation, it now wants to default to arm containers and has become very difficult to use, because it doesn't want to run containers under Rosetta 2.
The whole point of using Docker is to use the same containers in production as in development, so having Docker default to these random arm containers means my containers aren't exactly what's in production: they're arm-based and the servers are not.
I understand that Docker is the developer of the Docker software, but I really wish I could just click a button to make Intel-based containers the default and have to opt in to arm.
If anyone has an easy solution to this let me know. I don't want to spend hours and hours figuring out docker on my mac.
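There is a documented override for this (the image name below is just an example):

```shell
# Force an amd64 image for one run:
docker run --platform linux/amd64 nginx:latest

# Or make amd64 the default for everything Docker starts from this shell
# (put the export in ~/.zshrc to make it stick):
export DOCKER_DEFAULT_PLATFORM=linux/amd64
```

Docker Desktop also has a "Use Rosetta for x86_64/amd64 emulation on Apple Silicon" toggle in its settings, which covers the Rosetta half of the complaint.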
macOS apps have to be signed and notarised to run without a warning, which is a pretty big part of the defence picture for this software - the certificates can be revoked at any time to block the software if malicious behaviour is identified.
However, if I install Homebrew, then install python, then install a pip package, there's really no kind of scanning/notarization/checking happening at all. I wonder if this is something Apple has ever looked into - it seems like the exact scenario where you'd want to sandbox it away from the rest of the system.
At the same time, I don't truly understand why anyone would need to use it. If your preference is to totally work with macOS, then I'm sure this would be perfect for that. Otherwise, what's the advantage?
VMs have really come a long way. Every major OS today has a virtualization framework that makes running another OS extremely performant. Docker on macOS uses a virtual machine, but so what? Performance of individual containers, in my experience, isn't really a problem unless you're doing something with the GPU, and even then there are ways to deal with that. Even a fully-emulated VM using QEMU (without hypervisor or KVM) won't have any noticeable performance penalties in many cases.
IMO, there's a much greater advantage to sticking with Linux. Even if the host isn't Linux, developing and deploying with Linux guests provides a tremendous level of consistency and portability.
But maybe I'll be proven wrong by this project someday soon!
Of course the drawback would be that the host would see just one fat Linux process and its children, much like you can see qemu, but it could be an interesting thing nonetheless, if only for the shits and giggles of it.
linux is great. macos is great. windows is great too. for their intended purposes.
it’s horseless carriages all the way down.
Trying to break out of that is an exercise in futility.
Can you come up with situations where I would run a container instead of just running an app or sys service?
I believe this is the problem with the format of semantic versioning, which seems to assume that releases only happen to software that is ready to be... released :)
My preferred course of action in such situations is not specify a version at all.
A normal version number MUST take the form X.Y.Z where X, Y, and Z are non-negative integers, and MUST NOT contain leading zeroes. X is the major version, Y is the minor version, and Z is the patch version. Each element MUST increase numerically. For instance: 1.9.0 -> 1.10.0 -> 1.11.0.
So, no leading zeros, ta-da!

Oh, wait. The spec was written by some... big brain:
Major version zero (0.y.z) is for initial development. Anything MAY change at any time. The public API SHOULD NOT be considered stable.
So... my reading of this "definition" is that there's really no need for three digits if the major is zero... Then why on earth would you have two digits? Also, if there's no public API at this point, then why have versions at all? I mean, you clearly shouldn't be specifying anything with a zero major version as a dependency, because it should be illegal to depend on a library w/o a public API... Then, again, why have versions in this situation? And if the argument is that it's for internal use, then why standardize it for external use?

Just two paragraphs below. How lovely.
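The quoted grammar is small enough to check mechanically; here's a sketch of the core X.Y.Z rule as a Go regexp (pre-release and build metadata omitted for brevity):

```go
package main

import (
	"fmt"
	"regexp"
)

// Core semver shape per the spec: three dot-separated non-negative
// integers, no leading zeroes.
var semverCore = regexp.MustCompile(`^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)$`)

func main() {
	for _, v := range []string{"1.9.0", "1.10.0", "01.9.0", "0.4.2"} {
		fmt.Printf("%-7s valid: %v\n", v, semverCore.MatchString(v))
	}
}
```

Note that "0.4.2" passes: the spec's grammar happily versions software it simultaneously declares unstable.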
Yep, this is the primary goal of this project.
I would be very curious as to how you already run darwin containers.
The only alternative is spinning a macOS VM (including relying on macOS CI machines as a remote job executor)
Something like namespaces or proper jails on Darwin would be super cool, but not at the expense of other security measures, and not with a chroot-ish outcome, imho. Maybe this works for some, but not for me :)
They’re not even trying, now.
launchd inspired systemd.
Spotlight (real time indexing and notification) is something I miss in Linux today.
64bit Unix layer on consumer hardware (G5).
All of that stuff was not a first ever implementation, of course, but it was well executed and led the way.
All of that was more than a decade ago.
macbook is the best laptop there is but macos...
can't wait for a stable release of Asahi and permission from corporate to install it even in a VM somehow. probably won't happen, but one can dream.