Do we need a better layer of abstraction, i.e. better adoption and tighter integration for something like kustomize? Have we fucked up completely with Kubernetes due to it being outrageously complicated for simple tasks? How do we redesign this to be simpler? Is the complexity even a problem for the target audience?
I've no idea. I just know I'm a Kubernetes admin and I can't write a Deployment YAML without googling or copy/pasting.
I wouldn't be surprised if we eventually see new abstractions for "you just want a plain ol' deployment with CI/CD du jour, a persistent volume claim, and a service ingress, just like 90% of all other CRUD webapps? Sure, here's a simple resource for that."
I think we'll start seeing a move towards more "opinionated" tools, just to outsource some of the decision making. No sense learning how to write your own pipelines if you can find a tool that says "we're gonna deploy every time you make a git tag and run `mvn package`, you figure the rest out".
* `helm create` to get the default scaffold
* modify a handful of entries in the generated values file
* done!
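For a typical service, the "handful of entries" is roughly this sketch of a values file (the values are made up, and the exact keys vary by scaffold version):

```yaml
# values.yaml -- the handful of entries to touch after `helm create`
image:
  repository: registry.example.com/myapp   # hypothetical image
  tag: "1.4.2"
service:
  type: ClusterIP
  port: 8080
ingress:
  enabled: true
  hosts:
    - host: myapp.example.com
      paths:
        - path: /
resources:
  requests:
    cpu: 100m
    memory: 128Mi
```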
Only thing is the default Helm chart starter does not support autoconfiguring volumes, and since we're porting a lot of stateful apps to Kubernetes we just modified the default starter to include that capability.
Of course it would be nice to not have to maintain a bunch of different virtually identical templates.
Pulumi looks interesting but unfortunately seems to insist on vendor lock-in (see the jerk-around on https://github.com/pulumi/pulumi/pull/2697). So I'm looking forward to the AWS CDK (https://aws.amazon.com/cdk/) maturing a bit.
Collectively, as an industry we have gone insane.
The rest are soon to follow I’m sure.
I really hope we move more towards these opinionated tools that can handle 90% of the use cases. Most people just want to host an app on a port, and it's a pain to have to develop that pipeline at every company that wants to adopt Kubernetes.
That is why many businesses have an ops team managing Kubernetes and providing tools like cert-manager, Istio, and so on, while the rest of the company just uses what the ops team built.
Right now, everyone is building their own distro, which IMHO proves the need for one.
All this is to say that you're at very least certainly half-correct, in that k8s is a very flexible tool that can be used to build a very simple, elegant, and ergonomic PaaS.
I'm not sure I agree that it's inappropriate for most businesses though, unless you think that only a PaaS like Heroku is appropriate for most businesses; the analogy I'd suggest is "Heroku vs. running your own VMs" circa 2010. Heroku is great for getting started, and lets you move fast by abstracting away a bunch of infra. But it's also restrictive; you can't pick and choose your components freely. As you grow past a certain point you'll almost certainly need the flexibility (or just cost-effectiveness) that you get from running your own infrastructure.
K8s is an improvement here because you can run a managed cluster on something like GKE, which takes away most of the operational toil, while still giving you a lot of flexibility on what components / pieces to include. The k8s domain API does a great job of abstracting away true infra concerns like volumes, compute, scheduling, load-balancing, etc, while making it really easy to package, use, and iterate on the stuff that sits on top of that infrastructure.
I'd probably not encourage a seed-stage startup to use k8s unless you're very familiar with the tool; a PaaS like Heroku would likely be more appropriate. However at the point that you'd usually graduate to running your own VMs (wherever that is on your own company's developmental path), I'd say that using k8s is now a better choice.
What Kubernetes allows you to do is very complex, so you need a capable system to express it all. YAML isn't always the best, but it works fine for most, and tools like these are very helpful.
Do you use an IDE? Does that mean the language and framework is too complex? No, it's just a tool to help you get things done. More tools aren't a bad thing.
This is simply not true. Most complexity in today's systems is completely avoidable. Most developers are just mentally stuck and don't even try. "Managing" complexity is a great way to achieve job security without becoming good at anything specific. Instead of learning how to design systems, people are learning how to write config files.
Was Apache, Asterisk, or loading and hardening a Linux host on bare metal easier?
I seem to remember a lot of wrangling custom kernels to get Asterisk sounding just right, bizarre Apache, & network configs.
It’s just text? It’s always going to turn into a nebulous mess without literal edges and boundaries.
That’s Google’s play with it, IMO. Train tracks. Which is what I hate about it.
Google hasn’t built a less Byzantine text mess. It’s built hype, though, with a boring tool.
> Was Apache, Asterisk, or loading and hardening a Linux host on bare metal easier?
Yes, and by far. Adding a layer on top of all the traditional Linux daemons, tools and libraries does not decrease the total complexity - quite the contrary.
When you have a bug in an application that is related to something on another layer, you have to walk through the whole stack.
Examples: A bug in a network card impacting only large UDP packets. A race condition of file access triggered by NFS or a storage device driver. A vulnerability based on a timing attack due to CPU caches.
The deeper the stack, the worse.
I don't really get the love for hating on K8s complexity on here - nobody says it's the perfect tool for every use case... but when you do need it, it's amazing, and it has many advantages over the current trend for serverless IMO.
What about Nginx or Apache configuration files? Could you write them without googling?
Easily, because no one writes them from scratch; they just modify the default one.
For most of my Kubernetes work I "kubectl get -o yaml" something existing, modify the YAML, then "kubectl apply" it back again.
It provides the promise of portability, so the business feels less locked in to a vendor, while still being dense and impenetrable enough to them that IT won’t be done out of a job.
Simpler, more efficient alternatives are seen as more expensive, because IT hides the true cost of just how much of their salaries is spent on non-productive pottering around with YAML.
These systems and products also threaten to automate IT out of a job and don’t necessarily count for much on their resumes. If you’re not a programmer, what interest do you have in dumping all of your intricate, fragile environment for a PaaS? Not much.
More tooling to write good k8s yaml sounds good to me.
Honestly, I get the pieces and I know where to look to get what so I'm not too bothered. The problem with wrapper tools is that sometimes I can't get at the insides and then I have to learn the wrapper tool. So I'm going to just stick to raw Kube until one of the wrappers wins out.
There's a trade-off between a tool being too specific to a use case and being too generic, with the associated "boilerplate". But I don't know how much simpler Kubernetes could be made while still covering the intended scope.
It has a very simple basic model, which allows you to recursively build more and more complex abstractions on top of it.
- a dedicated editor with intelligent autocomplete
- stop using YAML, it soon becomes unreadable. JSON is easier to grok.
For anything non-trivial you will want inline comments.
Also while any JSON can be expressed in YAML, the reverse is not true.
The best way is to stop using both and generate the objects from a higher-level language, at least something like Jsonnet (which is really just a step up, so better not to stop there, but I'll take what I can get).
A.K.A. a schema file that a general purpose editor can consume. Please don't make me use some single-purpose editor just for autocomplete.
If k8s changes notation, having a layer of abstraction (hopefully) allows me to run the same command to generate new YAML (very helpful in automation).
Without going into the complexity of Deployments, consider the lowly Pod. What configuration does your app need? What is the name of the container that contains it? How much memory does it use? How much CPU does it need? What ports does it listen on? What HTTP endpoint handles the health check? Does that endpoint test liveness or readiness? What filesystems does it need? What setup needs to be done before the main container runs? Does it need any special resources like GPUs? The list goes on.
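Concretely, a minimal Pod spec answering several of those questions might look like the following sketch (all names and numbers here are made up for illustration):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: myapp                       # hypothetical app
spec:
  initContainers:                   # "what setup needs to be done first?"
    - name: migrate
      image: myapp:1.0
      command: ["./migrate"]
  containers:
    - name: myapp                   # "what is the container called?"
      image: myapp:1.0
      ports:
        - containerPort: 8080       # "what ports does it listen on?"
      resources:
        requests:
          cpu: 250m                 # "how much CPU does it need?"
          memory: 256Mi             # "how much memory does it use?"
        limits:
          memory: 512Mi
      livenessProbe:                # "liveness or readiness?"
        httpGet:
          path: /healthz            # "what endpoint handles the health check?"
          port: 8080
      volumeMounts:
        - name: data
          mountPath: /var/lib/myapp # "what filesystems does it need?"
  volumes:
    - name: data
      emptyDir: {}
```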
The problem here is that when you're writing a Pod spec, you're building a single-purpose computer from scratch. In the traditional UNIX world, people answered most of these questions for you. How much RAM can my app use? However much I plugged in. How much CPU can my app use? All of them. What ports does it listen on? Any of them from 1024-65535. What filesystems does it need? Whichever ones I set up in /etc/fstab.
I don't think it's a stretch to call UNIX's "yolo" approach problematic. It is great when you have one server running one app, but servers have gotten gigantic (with pricing to match) while applications have largely stayed the same size. This means you have to pack multiple apps onto one physical server, and to do that, there have to be rules. When you write a Kubernetes manifest, you are just answering every possible question upfront so that the entire system runs smoothly even if your individual component doesn't. It's the cost of having small apps on big computers.
The problem comes from applications that you didn't write, or don't fully understand. Before you can understand how the application behaves, you have to write a manifest. But you don't know the answers to the questions like how much CPU you're going to use, or what the worst case memory usage is, etc. This causes a lot of cognitive dissonance, because the entire file is you admitting to the computer that you have no idea how to configure it. No abstraction layer is going to fix that problem, except by hiding those uncomfortable details from you. (And you will always regret using the "yolo" defaults -- who hasn't tried to SSH into a broken server only to have Linux helpfully OOMKill sshd or your bash instance when you're just trying to kill your malfunctioning app.)
This is largely the fault of application developers. They aren't willing to commit to reasonable resource limits because they don't want to handle support requests that are related to underprovisioning. My experience is that applications that set limits pick them wrong. For example, GCP and DigitalOcean's managed Kubernetes offerings both install monitoring agents to support their dashboards; these apps ship with limits that are too low and any reasonable Prometheus installation will notice that they are being CPU throttled and warn you about it. Now you have to waste your day asking "is this a real problem?"
Many open-source apps go the other way and pick resource limits that truly encapsulate the worst case and require individual nodes that are many times larger than the entire cluster. Yes, it would be nice if I gave each pod 32 CPUs and 128GiB of RAM... but I don't want to pay $2000/month/replica thankyouverymuch. (I've been on the other side of that where resources didn't cost me real money and happily used terabytes of RAM as cache.)
Application-level configuration is also not in a great state. Everyone tries to sell you their curated defaults so they don't have to write any documentation beyond a "quick start". (I'm as guilty of that as anyone in fact!) The application will have some built-in defaults (so the developers writing the app can just "go run main.go" and get the config they need). Then someone comes along to make a Helm chart for you, and they change the defaults so that their local installation doesn't need any customization. This only causes problems because instead of an undocumented underlying application, now you have that AND an undocumented abstraction layer. You may find the answer to your question "how do I configure FooApp to bar?" but have no way of communicating that config through the Helm abstraction layer because the author of the Helm chart never thought anyone would do that.
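The Helm gap is easy to hit, because a chart can only pass through what its author bothered to template. A sketch, using a hypothetical chart:

```yaml
# templates/configmap.yaml in a hypothetical FooApp chart
apiVersion: v1
kind: ConfigMap
metadata:
  name: fooapp-config
data:
  # The chart author templated the log level...
  LOG_LEVEL: {{ .Values.logLevel | quote }}
  # ...but hard-coded everything else. `helm install --set bar=true`
  # has nothing to bind to here: FooApp's `bar` option is unreachable
  # without forking the chart.
  BAR_ENABLED: "false"
```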
This rant has gotten quite long so I'll wrap it up. No abstraction layer is ever going to make it so you don't need to answer difficult questions. The actual list of questions to answer is available through "kubectl explain pod.spec" and friends, however.
Tools like Helm solve this problem.
For production-y things, however, some meta-config language that allows deterministic templating would be a huge improvement. It lets you make sweeping/uniform infrastructure changes from a single library or tool change.
Kubecfg is a good example of the basics one could implement [0], although its examples aren't as fully fledged and organized as they could be.
[0] - https://github.com/bitnami/kubecfg/blob/master/examples/gues...
Although it is still in its early days, it is excellent to use and will only get better with additional tooling.
We can use TypeScript interfaces (which give us nice IDE code completion) to define our YAML.
We can then create functions where we would normally duplicate YAML. Really nice. https://www.pulumi.com/kubernetes/
This is the way to go for sure. I've done similar by generating CloudFormation from Python (I wrote my own library because I felt Troposphere was not very friendly nor a significant improvement over YAML).
Typing turns out to be pretty useful when you're generating YAML. While my library was fully typed, Python's type checking left a lot to be desired: many stupidly common things still can't be expressed (JSON, kwarg callbacks, etc.), getting mypy to find/accept type annotations for libraries is stupidly hard, and the IDE integrations are pretty awful at this point. TypeScript users would enjoy a real leg-up here, since its type system isn't half-baked.
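The generate-from-a-typed-language idea can be sketched in a few lines of Python. Everything here (the `Container` dataclass, the `deployment` helper) is hypothetical, not a real client library; it just shows that typos in field names fail in your editor or type checker rather than in the cluster. The output is JSON, which `kubectl apply -f` also accepts, since every JSON document is valid YAML:

```python
import json
from dataclasses import dataclass

# Hypothetical typed wrapper -- not a real Kubernetes client library.
@dataclass
class Container:
    name: str
    image: str
    port: int

def deployment(name: str, replicas: int, c: Container) -> dict:
    """Build a Deployment object as a plain dict from typed inputs."""
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {
                    "containers": [{
                        "name": c.name,
                        "image": c.image,
                        "ports": [{"containerPort": c.port}],
                    }],
                },
            },
        },
    }

# Serialize once; reuse the function instead of duplicating YAML.
manifest = deployment("myapp", 3, Container("myapp", "myapp:1.0", 8080))
print(json.dumps(manifest, indent=2))
```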
Yes and no.
Typing is a must, but a full-blown programming language is too powerful and all abstraction layers start to leak sooner rather than later. I always ended up with a "deployment" function that exposed almost all underlying functionality.
We're big fans of the Cue (https://cuelang.org) approach instead: https://cuelang.org/docs/about
A UI like this is useful to so many. It makes the experience of creating these YAML files easier. Thanks for sharing it.
Paul, as an aside, we absolutely love how feature-packed Octopus is nowadays. We have been using it since 2.0, and I don't think we will ever give it up. Thanks for making one of the best tools we use on a daily basis!
That said, this still looks cool. I just hope we won’t need a Kubernetes configuration generator generator anytime soon.
The bigger issue is that there are a lot of things you can do with it, and editing the text YAML/JSON serialization of the objects is probably one of the least efficient ways of dealing with it, but it's also the "default" way people know.
It's much easier when your editor actually understands the object you're editing instead of at best helping you write correct YAML.
https://github.com/zegl/kube-score
https://stelligent.github.io/config-lint/#/
I'm obviously biased, but it's been hugely successful! kube-score is working very well out of the box, and there's only a handful of cases where the "ignore" annotation has been used to disable a check that's too strict for the particular use case.
Feel free to reach out if you have any questions or comments.
Aside from making it easy to generate k8s manifests, this could also be a great learning tool. If you allowed this to generate multiple resources that are linked, it could be a great illustration of how different resources fit together.
A clear schema (like TypeScript interfaces or something similar) which allows generating a UI.