On the other hand, one of the main criticisms of Kubernetes is that it has no composition or orchestration capabilities. It's great at defining pieces of state, but managing blocks of state & multiple things at once is left almost entirely to external tools.
The ability to compose & sequence multiple containers feels like a very specific example of a much broader general capability. There's bedevilling, infinite complexity in trying to figure out a fully expressive state management system - I get why refining a couple of specialized existing capabilities is the way - but it does make me a little sad to see a lack of appetite for the broader crosscutting system problem at the root here.
Inside of an ECS task you can add multiple containers and on each container you can specify two fields: `dependsOn` and `essential`. ECS automatically manages container startup order to respect the dependencies you have specified, and on shutdown it tears things down in reverse order. Instead of having multiple container types with different hardcoded behaviors there is one container type with flexible, configurable behavior. If you want to chain together 4 or 5 containers to start up one by one in a series you can do that. If you want to run two things in parallel and then once both of them have become healthy start a third you can do that. If you want a container to run to completion and then start a second container only if the first container had a zero exit code you can do that. The dependency tree can be as complex or as simple as you want it to be: "init containers" and "sidecar containers" are just nodes on the tree like any other container.
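To make the "everything is a node on the tree" idea concrete, here is a minimal sketch of how a `dependsOn`-style graph resolves into a startup order. It is not how ECS is implemented; the container names are hypothetical, and the sketch only uses the condition labels (`START`, `COMPLETE`, `SUCCESS`, `HEALTHY`) as annotations, without modelling health checks or exit codes.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical task: names and conditions are illustrative only.
# Maps container name -> {dependency name: condition that must hold first}.
task = {
    "migrate": {},
    "proxy": {},
    "app": {"migrate": "SUCCESS", "proxy": "HEALTHY"},
    "log-shipper": {"app": "START"},
}


def startup_waves(containers):
    """Group containers into waves; everything in one wave can start in parallel."""
    sorter = TopologicalSorter(
        {name: set(deps) for name, deps in containers.items()}
    )
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


def shutdown_waves(containers):
    """Tear things down in the reverse of the startup order."""
    return list(reversed(startup_waves(containers)))


if __name__ == "__main__":
    for i, wave in enumerate(startup_waves(task), 1):
        print(f"wave {i}: {', '.join(wave)}")
```

In this sketch an "init container" is just a node others depend on with `SUCCESS`, and a "sidecar" is a node others depend on with `START` or `HEALTHY`; no separate container type is needed.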
In some places I love the Kubernetes design philosophy of more resource types, but in other aspects I prefer having fewer resource types that are just more configurable on a resource by resource basis.
It's pretty cool how one can have a .device or whatnot that then wants a service: plug in a device & its service starts. The arbitrary composability enables lots of neat system behaviors.
dependsOn was proposed during the KEP review but deferred. But because init containers and regular containers share the same behavior and shape, and differ only on container restart policy, we are taking a step towards “a tree of container nodes” without breaking forward or backward compatibility.
Given the success of mapping workloads to k8s, the original design goal was to not take on that complexity, and it’s good to see others making the case for bringing that flexibility back in.
It wasn’t simple, but with meta controller[1] it was relatively easy to orchestrate the complex state transitions this single logical resource needed and to treat the whole thing as a single unit.
I’m not saying Kubernetes can’t make simple patterns easier, but baking it into core leads to the classic “tragedy of the standard library” problem where it becomes hard to change that implementation. And the k8s ecosystem is definitely all about change.
1. https://metacontroller.github.io/metacontroller/intro.html
Pragmatism won out, thankfully IMO.
Edit to add: see this better description from one of the senior k8s maintainers: https://news.ycombinator.com/item?id=36666359
It seems unideal to me to forever bunt on this topic, leaving it out of core forever. Especially when we are slowly adding very specialized composition/orchestration tools to core.
In the Cynefin framework (https://en.wikipedia.org/wiki/Cynefin_framework), you can reason through "complicated" domains the way you are suggesting, but that will not work in the "complex" domain. And I think what Kubernetes helps manage is in the "complex" domain, not the "complicated" one.
I hope the irony is lost on no one that this is an orchestration tool for an immutable technology, and the orchestrator isn't immutable.
Kubernetes already has an issue with a (perceived) high barrier to entry, and I'm not sure that "restartPolicy on a container means this, unless it's used in this list of containers, in which case it means that" helps.
I would have preferred to see a separate attribute (such as `sidecar: true`), rather than overloading (and in my opinion, abusing) the existing `restartPolicy`.
The only difference today between init containers and regular containers is:
a) init containers have an implicit default restart policy of OnFailure, and regular containers inherit the pod's restartPolicy
b) init containers are serial, regular containers are parallel
We are leaving room for the possibility that init containers can fail the pod, and be parallelized, as well as regular containers having unique restartPolicies. Both of those would allow more control for workflow / job engines to break apart monolith containers and get better isolation.
The key design point was that “sidecars aren’t special containers” - because we want to leave room for future growth.
https://cloud.google.com/sql/docs/postgres/connect-kubernete...
https://github.com/GoogleCloudPlatform/cloudsql-proxy/issues...
TLDR: Introduce a restartPolicy field to init containers and use it to indicate that an init container is a sidecar container. Kubelet will start init containers with restartPolicy=Always in the order with other init containers, but instead of waiting for its completion, it will wait for the container startup completion.
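A rough sketch of the startup sequence that TLDR describes, as I read it. This is not kubelet code: the pod is a plain Python dict shaped like a manifest, the container names are hypothetical, and `startup_plan` is a made-up helper that only records what each step waits for before moving on.

```python
# Hypothetical pod: container names are illustrative only.
pod = {
    "initContainers": [
        {"name": "fetch-config"},
        # restartPolicy=Always on an init container marks it as a sidecar.
        {"name": "mesh-proxy", "restartPolicy": "Always"},
        {"name": "run-migrations"},
    ],
    "containers": [{"name": "app"}],
}


def is_sidecar(init_container):
    """An init container with restartPolicy=Always is treated as a sidecar."""
    return init_container.get("restartPolicy") == "Always"


def startup_plan(pod):
    """Return (name, waits_for) pairs in the order containers are started.

    waits_for is what must happen before the next container is started:
    "completed" for ordinary init containers, "started" for sidecars,
    and None for regular containers, which start together at the end.
    """
    steps = []
    for c in pod.get("initContainers", []):
        steps.append((c["name"], "started" if is_sidecar(c) else "completed"))
    for c in pod.get("containers", []):
        steps.append((c["name"], None))
    return steps


if __name__ == "__main__":
    for name, waits_for in startup_plan(pod):
        print(f"{name}: next step waits until {waits_for}")
```

The point of the sketch is that `mesh-proxy` keeps running while `run-migrations` and `app` start, so a later init container can already rely on it.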
But I'm not sure about initContainers being used. init keyword implies it'd run and die in order for others to continue. Using restartPolicy with init instead of a dedicated sideCars field feels weird.
Also, these are mostly init containers that run longer, and you want a sidecar not starting to be able to block regular pods, and adding a new container type (like ephemeral containers) is extremely disruptive to other parts of the system (security, observability, and UI), so we looked to minimize that disruption.
This sometimes posed problems because they weren't available for the full life cycle of the pod, notably during the init phase. So if your init containers needed secrets, connections, networking... that were being provided via a sidecar container, you were going to have a hard time.
With this change, among other things, sidecar containers are going to be available for the whole life cycle of the pod.
There are other implications, probably, but I still haven't finished reading the KEP [0]. Check it out, and there you'll find its motivation and several interesting examples.
0: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/753-sidecar-containers
Edit: corrected syntax
> Sidecar containers are a new type of containers that start among the Init containers, run through the lifecycle of the Pod and don’t block pod termination. Kubelet makes a best effort to keep them alive and running while other containers are running.
[1] https://github.com/kubernetes/enhancements/tree/master/keps/...
"The new feature gate "SidecarContainers" is now available. This feature introduces sidecar containers, a new type of init container that starts before other containers but remains running for the full duration of the pod's lifecycle and will not block pod termination."
this is great for things like Jobs and Istio
eliminates the scheme where the main container had to signal to the sidecar it was exiting otherwise the pod would hang
I'm looking at the logging use case and want to be able to add a log shipper sidecar to a pod with ephemeral storage.
You wouldn’t need a sidecar-per-pod this way either.
If you've got self-managed clusters, it'd be possible to enable with a feature gate on the API server, but it's unlikely to be available on managed Kubernetes until it gets to GA.