Kubernetes SidecarContainers feature is merged

(github.com)

217 points · xdasf · 2y ago · 59 comments

jauntywundrkind2y ago· 12 in thread

On the one hand, great.

On the other hand, one of the main criticisms of Kubernetes is that it has no composition or orchestration capabilities. It's great at defining pieces of state, but managing blocks of state & multiple things at once is left almost entirely to external tools.

The ability to compose & sequence multiple containers feels like a very specific example of a much broader general capability. There's bedevilling, infinite complexity in trying to figure out a fully expressive state-of-state management system - I get why refining a couple of specialized existing capabilities is the way - but it does make me a little sad to see a lack of appetite for the broader crosscutting system problem at the root here.

NathanKP2y ago

Yeah I work on the team that builds Amazon Elastic Container Service so I can't help but compare this implementation with how we solved this same problem in ECS.

Inside of an ECS task you can add multiple containers and on each container you can specify two fields: `dependsOn` and `essential`. ECS automatically manages container startup order to respect the dependencies you have specified, and on shutdown it tears things down in reverse order. Instead of having multiple container types with different hardcoded behaviors there is one container type with flexible, configurable behavior. If you want to chain together 4 or 5 containers to start up one by one in a series you can do that. If you want to run two things in parallel and then once both of them have become healthy start a third you can do that. If you want a container to run to completion and then start a second container only if the first container had a zero exit code you can do that. The dependency tree can be as complex or as simple as you want it to be: "init containers" and "sidecar containers" are just nodes on the tree like any other container.
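The ordering behavior described above can be sketched as a topological sort over the dependency tree. This is only an illustration of the idea, not how ECS is implemented: the container names are hypothetical, and the sketch ignores the health and exit-code conditions mentioned in the comment.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def startup_order(depends_on: dict[str, list[str]]) -> list[str]:
    """Return a startup order in which each container follows its dependencies."""
    # TopologicalSorter takes a mapping of node -> predecessors and
    # yields predecessors before the nodes that depend on them.
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical task: "migrate" needs "proxy"; "app" needs both.
task = {
    "proxy": [],
    "migrate": ["proxy"],
    "app": ["proxy", "migrate"],
}

start = startup_order(task)
stop = list(reversed(start))  # teardown runs in reverse startup order
```

In this model an "init container" or a "sidecar" is just a node with particular edges, which is the point the comment makes about having one configurable container type.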

In some places I love the Kubernetes design philosophy of more resource types, but in other aspects I prefer having fewer resource types that are just more configurable on a resource by resource basis.

jauntywundrkind2y ago

Your approach sounds a lot like systemd's, with explicit dependencies in units coupling them to each other.

It's pretty cool how one can have a .device or whatnot that then wants a service: plug in a device & its service starts. The arbitrary composability enables lots of neat system behaviors.

politelemon2y ago

As a consumer, ECS + Fargate is my happy path. I appreciate the lack of complexity. Thanks.

smarterclayton2y ago

In general, the intent here is to leave open room for just that.

dependsOn was proposed during the KEP review but deferred. But because init containers and regular containers share the same behavior and shape, and differ only on container restart policy, we are taking a step towards “a tree of container nodes” without breaking forward or backward compatibility.

Given the success of mapping workloads to k8s, the original design goal was to not take on that complexity, and it’s good to see others making the case for bringing that flexibility back in.

perryizgr82y ago

I've a question that I've been wondering about for a while. Why does ECS impose a 10 container limit on a task? It proves very limiting in some cases and I've had to find hacky workarounds like dividing a task into two when it should all have lived and died together.

orf2y ago

I like it this way to be honest. We needed to create a custom controller for Dask clusters consisting of a single scheduler, an auto-scaling set of nodes, an ingress and a myriad of secrets, configmaps and other resources.

It wasn’t simple, but with Metacontroller [1] it was relatively easy to orchestrate the complex state transitions this single logical resource needed and to treat the whole thing as a single unit.

I’m not saying Kubernetes can’t make simple patterns easier, but baking it into core leads to the classic “tragedy of the standard library” problem where it becomes hard to change that implementation. And the k8s ecosystem is definitely all about change.

1. https://metacontroller.github.io/metacontroller/intro.html

theptip2y ago

This is all true, and if you read the KEPs they were thinking about this. One camp was advocating for solving the problem of specifying the full dependency graph spec (of which sidecars are one case), another advocating for just solving the most needed case with a sidecar-specific solution to get a solution shipped. The latter was complicated by a desire to at least leave the door open for the former.

Pragmatism won out, thankfully IMO.

Edit to add: see this better description from one of the senior k8s maintainers: https://news.ycombinator.com/item?id=36666359

penciltwirler2y ago

There's lots of tools built on top of K8s to accomplish this tho. For example, Argo, Tekton, Flyte etc.

jauntywundrkind2y ago

Absolutely, no shortage of things atop. Helm is probably the most well used composition tool.

It seems unideal to me to forever bunt on this topic, leaving it out of core forever. Especially when we are slowly adding in very specialized composition orchestration tools in core.

hosh2y ago

Composing blocks of state may not end up producing more reliable software. Each piece of state management is controlled by an independent process that may interact with the others (example: horizontal pod autoscalers are not directly aware of cluster-autoscaler). The whole system is more like an ecology or a complex adaptive system than something you can reason about directly with abstractions.

In the Cynefin framework (https://en.wikipedia.org/wiki/Cynefin_framework), you can reason through "complicated" domains the way you are suggesting, but that will not work in the "complex" domain. And I think what Kubernetes helps manage is in the "complex", not the "complicated", domain.

0xbadcafebee2y ago

Orchestration of k8s wouldn't be necessary if they had made K8s' operation immutable. As it stands now you just throw some random YAML at it and hope for the best. When that stops working, you can't just revert back to the old working version, you have to start throwing more crap at it and running various operations to "fix" the state. So you end up with all these tools that are effectively configuration management tools to continuously "fix" the cluster back to where you want it.

I hope the irony is lost on no one that this is an orchestration tool for an immutable technology, and the orchestrator isn't immutable.

ed_mercer2y ago

You can use gitops (eg fluxcd) to revert to previous cluster states.

nrmitchi2y ago· 4 in thread

While this is a very welcome improvement in terms of functionality, I can't help but feel that the re-use of "restartPolicy" to mean something similar, but different, when used in a different context, is a very poor decision.

Kubernetes already has an issue with having a (perceived) high barrier to entry, and I'm not sure that "restartPolicy on a container means this, unless it's used in this list of containers, in which case it means this" helps.

I would have preferred to see a separate attribute (such as `sidecar: true`), rather than overloading (and in my opinion, abusing) the existing `restartPolicy`.

smarterclayton2y ago

The challenge with a separate attribute is that it is not forward compatible with new features we might add to pods around ordering and lifecycle. If we used a simple boolean, eventually we’d have to have it interact with other fields and deal with conflicting behaviors between what “sidecar” means and more flexibility.

The only difference today between init containers and regular containers is:

a) init containers have an implicit default restart policy of OnFailure, and regular containers inherit the pod's restartPolicy

b) init containers are serial, regular containers are parallel

We are leaving room for the possibility that init containers can fail the pod, and be parallelized, as well as regular containers having unique restartPolicies. Both of those would allow more control for workflow / job engines to break apart monolith containers and get better isolation.

The key design point was that “sidecars aren’t special containers” - because we want to leave room for future growth.

0xbadcafebee2y ago

It's par for the course. Most of K8s's design has been shoving whatever crap they feel like in, regardless of confusion, difficulty, complexity, etc for the end user.

perryizgr82y ago

At some level it seems deliberate so that administration of the complexity can be sold to you for a price once you realise that you can't hack it on your own, but are now too invested to back out.

c7DJTLrn2y ago

I've been brushing up on my Kubernetes knowledge recently and came across so much gross stuff like this. "If field X is set to Y, then value Z for key V is invalid." Jesus christ. I wish they put more effort into approachability.

yla922y ago· 3 in thread

A very welcome change. It's gonna be helpful for the case where the database proxy (CloudSQL) and the main container got terminated out of order.

https://cloud.google.com/sql/docs/postgres/connect-kubernete...

giovannibonetti2y ago

That is very annoying. I remember having spent some time with this same issue in Google App Engine as well, which also runs Cloud SQL Proxy as a sidecar container.

https://github.com/GoogleCloudPlatform/cloudsql-proxy/issues...

numbsafari2y ago

The fact that this is needed for so many different things Google pushes, and it has been so slow to make it in, has been very frustrating and telling.

aranelsurion2y ago

Just FYI for people who don't know about it yet: with cloudsql-proxy v2 there's a new parameter called "--quitquitquit" that starts up an HTTP endpoint to be used for graceful shutdowns. Basically your main container makes a POST to this endpoint, and sidecar exits.
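A minimal sketch of that shutdown call from the main container, using only the Python standard library. The host, port 9091, and the `/quitquitquit` path are assumptions made for illustration and are not stated in the comment above; check the proxy's own documentation for the real address before relying on this.

```python
import urllib.request


def build_quit_request(host: str = "localhost", port: int = 9091) -> urllib.request.Request:
    """Build the POST request that asks the proxy sidecar to exit.

    The default host, port, and path are assumptions, not confirmed values.
    """
    url = f"http://{host}:{port}/quitquitquit"
    # An empty body with an explicit method makes this a POST.
    return urllib.request.Request(url, data=b"", method="POST")


def request_sidecar_shutdown(timeout: float = 5.0) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(build_quit_request(), timeout=timeout) as resp:
        return resp.status
```

The main container would call `request_sidecar_shutdown()` as its last step before exiting, so that the pod does not hang waiting on the proxy.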

xdasfOP2y ago· 3 in thread

KEP: https://github.com/kubernetes/enhancements/tree/master/keps/...

TLDR: Introduce a restartPolicy field to init containers and use it to indicate that an init container is a sidecar container. Kubelet will start init containers with restartPolicy=Always in the order with other init containers, but instead of waiting for its completion, it will wait for the container startup completion.
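To make the TLDR concrete, here is a rough model of the startup sequence it describes, with a pod spec represented as a plain Python dict. This is a sketch of the described behavior, not kubelet code; the container names are hypothetical and "started" stands in for whatever the kubelet treats as startup completion.

```python
def is_sidecar(init_container: dict) -> bool:
    """An init container with restartPolicy=Always is treated as a sidecar."""
    return init_container.get("restartPolicy") == "Always"


def startup_plan(pod_spec: dict) -> list[str]:
    """Describe, step by step, what is started and what is waited for."""
    steps = []
    # Init containers run in order; only the wait condition differs.
    for c in pod_spec.get("initContainers", []):
        wait = "started" if is_sidecar(c) else "completed"
        steps.append(f"start {c['name']}, wait until {wait}")
    # Regular containers start once all init containers are handled.
    names = ", ".join(c["name"] for c in pod_spec.get("containers", []))
    steps.append(f"start {names}")
    return steps


# Hypothetical pod: one classic init container, one sidecar, one app container.
pod_spec = {
    "initContainers": [
        {"name": "fetch-config"},
        {"name": "mesh-proxy", "restartPolicy": "Always"},
    ],
    "containers": [{"name": "app"}],
}
plan = startup_plan(pod_spec)
```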

hunta20972y ago

Hopefully these changes should make Envoy sidecars (and sidecar co-existence in general) more reliable.

tommiegannert2y ago

What's the use-case for Envoy as a sidecar? (As someone using Envoy Gateway.)

plagiarist2y ago

It seems like there could be a better marker for this. Maybe my skill with Kubernetes is too low for it to make sense.

CSDude2y ago· 2 in thread

It's a shame it took so long. If the main container's shutdown (i.e. connection draining, processing in-flight queue items) takes a while, your service mesh dies first (nice Go binary) and the main container cannot communicate with the internet anymore.

But I'm not sure about initContainers being used. The init keyword implies it'd run and die in order for others to continue. Using restartPolicy with init instead of a dedicated sideCars field feels weird.

smarterclayton2y ago

We did that to leave open more complex ordering of both init containers and sidecars (regular containers do not have a restart order). For instance, you might have a service mesh that needs a vault secret - those both might be sidecars, and you may need to ensure the vault sidecar starts first if both go down. Eventually we may want to add parallelism to that start order, and a separate field would prevent simple ordering from working now.

Also, these are mostly init containers that run longer, and you want a sidecar not starting to be able to block regular pods, and adding a new container type (like ephemeral containers) is extremely disruptive to other parts of the system (security, observability, and UI), so we looked to minimize that disruption.

zeeZ2y ago

Without restart policy, a failing init container is retried forever. With a policy of never, the entire pod is marked as having failed. The init containers still have to run and succeed before the main pod continues.

sidcool2y ago· 2 in thread

Any documentation on this? What does this mean?

tecleandor2y ago

So, until now, a sidecar container was just the idea of running containers in your Kubernetes pod, along with your main service, that were 'helpers' for something: connection to databases or VPNs, mesh networking, pulling secrets or config, debugging... But they didn't have special status, they were just regular containers in your pod.

This sometimes posed some problems because they weren't available for the full life cycle of the pod, notably on the init process. So if your init containers needed secrets, connections, networking... that was being provided via a sidecar container, you were going to have a hard time.

With this change, among other things, sidecar containers are going to be available for the whole life cycle of the pod.

There are other implications, probably, but I still haven't finished reading the KEP [0]. Check it out, and there you'll find its motivation and several interesting examples.

  0: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/753-sidecar-containers

Edit: corrected syntax

chdefrene2y ago

The KEP (Kubernetes Enhancement Proposal) is linked to in the PR [1]. From the summary:

> Sidecar containers are a new type of containers that start among the Init containers, run through the lifecycle of the Pod and don’t block pod termination. Kubelet makes a best effort to keep them alive and running while other containers are running.

[1] https://github.com/kubernetes/enhancements/tree/master/keps/...

AtNightWeCode2y ago· 2 in thread

When I first learned about the sidecar pattern I thought it was great. I am not sure about it anymore. Most of it could be moved into custom images or layers at the boundary. To me this feels a bit sketchy: to have containers that kinda are part of the mesh but then do not share the same lifecycle as the mesh.

verst2y ago

If you create a custom image you would need to create a complex health endpoint that is essentially only considered healthy if all the components baked into your image are considered healthy. This gets harder when you are not the author of the sidecar process on which you rely. With a single image it would be easier to run into a situation where the sidecar process (baked into your image) is in an unhealthy state but your container is not restarted because the app itself is not reporting unhealthy status.

AtNightWeCode2y ago

Monolith apps can have many dependency checks and it is not really an issue but I get your point. It can become messy. What I have seen go into sidecars is TLS termination, caching, authentication, service clients, metrics and logging. Things I would prefer to have in a dedicated proxy layer or in the images.

cacois2y ago· 1 in thread

In case anyone else was looking for a clear, concise summary of the new feature:

"The new feature gate "SidecarContainers" is now available. This feature introduces sidecar containers, a new type of init container that starts before other containers but remains running for the full duration of the pod's lifecycle and will not block pod termination."

rafaelturk2y ago

thank you!

fnord772y ago· 1 in thread

> Pod is terminated even if sidecar is still running

this is great for things like Jobs and Istio

eliminates the scheme where the main container had to signal to the sidecar it was exiting otherwise the pod would hang

jeremy_k2y ago

Yep, I was looking into running Jobs with Sidecars awhile back and came across this issue. I was actually surprised this morning to see a link on HN be in the "already read" state. Nice to see this feature merged, however our Cluster is on 1.25 I think? Probably a ways away from being able to use this.

tmzt2y ago· 1 in thread

Is there a clean way to share an emptyDir between sidecar(s) and main container(s)?

Looking at the logging usecase and want to be able to add a log shipper sidecar to a pod with ephemeral storage.
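One way this could look, sketched as a Python dict that is dumped to JSON (kubectl accepts JSON manifests as well as YAML). Both containers mount the same `emptyDir` volume, and the log shipper is declared as an init container with `restartPolicy: Always`, per the KEP summary quoted elsewhere in this thread. The image names, mount path, and pod name are hypothetical.

```python
import json


def pod_with_log_shipper() -> dict:
    """Build a Pod manifest where a sidecar and the app share an emptyDir."""
    # The same mount is reused by both containers so they see the same files.
    shared_mount = {"name": "logs", "mountPath": "/var/log/app"}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "app-with-log-shipper"},
        "spec": {
            "volumes": [{"name": "logs", "emptyDir": {}}],
            "initContainers": [
                {
                    "name": "log-shipper",
                    "image": "example.com/log-shipper:latest",  # hypothetical image
                    "restartPolicy": "Always",  # marks this init container as a sidecar
                    "volumeMounts": [shared_mount],
                }
            ],
            "containers": [
                {
                    "name": "app",
                    "image": "example.com/app:latest",  # hypothetical image
                    "volumeMounts": [shared_mount],
                }
            ],
        },
    }


manifest = pod_with_log_shipper()
print(json.dumps(manifest, indent=2))
```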

FridgeSeal2y ago

An easier solution for you might be something like Vector, which will automatically harvest the logs from pods and has excellent routing capabilities.

You wouldn’t need a sidecar-per-pod this way either.

raesene92y ago

Worth noting that this is hitting Alpha in Kubernetes 1.28, so won't be available by default at this stage.

If you've got self-managed clusters, it'd be possible to enable with a feature gate on the API server, but it's unlikely to be available on managed Kubernetes until it gets to GA.

sargun2y ago

This is great. My team at Netflix (I'm no longer there) sponsored some of the work behind this, via Kinvolk (now acquired by MSFT). Great to see that it finally shipped. At the time, this was a blocker to us using Kubelet, and we thought it might take a few...months to sort out. Turns out it was closer to a few years, but it's a tricky API, and important to get right.

annexrichmond2y ago

The lack of native sidecar support was my biggest surprise when moving from ECS to EKS, and it was not fun hacking with shared process IDs to accomplish sidecars. I'm glad this is finally in, but I'm also curious why it took roughly 3ish years(?) to go from KEP proposal to merge.

nodesocket2y ago

How does the syntax look for defining a sidecar in a deployment? Is it similar to initContainers?
