At my previous job, we migrated from a nasty cron orchestration system to Jenkins. It did a number of things, including building software, batch-generating thumbnails and moving data about, on around 30 nodes, of which about 25 were fungible.
Jenkins Job Builder meant that everything was defined in YAML, stored in git and repeatable. A sane user environment meant that we could execute as a user and inherit their environment. It has sensible retry logic, and lots of hooks for all your hooking needs. Pipelines are useful for chaining jobs together.
We _could_ have written them as normal jobs to be run somewhere in the 36k node farm, but that was more hassle than it's worth. Sure, it's fun, but we'd have to contend with sharing a box that's doing a fluid sim or similar, so we'd have to carve off a section anyway.
However, Kubernetes to _just_ run cron is a massive waste. It smacks of shiny new tool syndrome. Seriously, Jenkins is a single-day deployment. Transplanting the cron jobs is again less than a day (assuming your slaves have got a decent environment).
So, with the greatest of respect, talking about building a business case is pretty moot when you are effectively wasting what appears to be > two man months on what should be a week long migration. Think gaffer tape, not carbon fibre bonded to aluminium.
If, however, the rest of the platform lives on Kubernetes, then I could see the logic. Having all your stuff running on one platform is very appealing, especially if you have invested time in translating comprehensive monitoring into business-relevant alerts.
As you say -- I think "we want to run some cron jobs" isn't a good enough reason by itself to use Kubernetes (though it might be a good enough reason if you’re using a managed Kubernetes cluster where someone else handles the cluster operations). A goal for this project was to prove to ourselves that we actually could run production code in Kubernetes, to learn about how much work operating Kubernetes actually is, and to lay the groundwork for moving more things to Kubernetes in the future.
In my mind, a huge advantage of Kubernetes is that Kubernetes' code is very readable and they're great at accepting contributions. In the past when we've run into performance problems with Jenkins (we also use jenkins-job-builder to manage our 1k node Jenkins cluster), they've been extremely difficult to debug and it's hard to get visibility into what's going on inside Jenkins. I find Kubernetes’ code a lot easier to read, it's fairly easy to monitor the internals, and the core components have pprof included by default if you want to get profiling information out. Being able to easily fix bugs in Kubernetes and get the patches merged upstream has been a big deal for us.
Why wasn't the final sentence "and to re-evaluate if moving forward was even a good idea?"
Because I get nervous every time someone is relying on their patches to be included upstream, or needs to dive into the internals of something repeatedly. That screams "not production ready" to me.
After reading the post, Kubernetes did not sound at all like a slam dunk in terms of a solution, let alone a foundation for more mission critical infrastructure. The Jenkins solution offered by the parent sounds more reasonable, even with the objections you list.
Edit: Take my comments with a grain of salt, but from my internet armchair vantage point it does sound like Kubernetes was chosen first and rationalized second. (Though I very much appreciated the thoroughness with which you went about learning the technology.)
Saying Jenkins can be configured in a day, to the degree that Stripe configured Kubernetes (with Puppet), is disingenuous. It would take more than a day to do the configuration management of the slaves and get the right dependencies for all the jobs.
How do you isolate job executions in Jenkins? In Kubernetes each job is inherently isolated in containers. In Jenkins you have a bunch of choices. Do you only run one executor per slave? OK, but then you have a bunch of wasted capacity some of the time, and not enough capacity at other times. You could dynamically provision EC2 instances to scale capacity, but then you need a setup to bake your slave AMIs, and you have potentially added ~3 minutes to jobs for EC2 provisioning. You can run the jobs in Docker containers on the slaves, which will probably get you better bin packing, but it doesn't have resource management in the way Kubernetes does, so you could easily overload a slave (leading to failure) while other slaves are underutilized.
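To make the bin-packing point concrete, here is a minimal sketch in Python. It is not Jenkins or Kubernetes code: the agent sizes, job names, memory figures and function names are all made up for illustration. It compares handing out jobs round-robin with no idea of their memory needs against a first-fit placement that checks each job's declared request, which is roughly the kind of check a resource-aware scheduler makes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    mem_gb: int                                # total memory (hypothetical figure)
    jobs: list = field(default_factory=list)   # (job name, memory request in GB)

    def used(self) -> int:
        return sum(mem for _, mem in self.jobs)

    def overloaded(self) -> bool:
        return self.used() > self.mem_gb


def place_round_robin(jobs, nodes):
    """Assign jobs in turn, ignoring how much memory each one needs."""
    for i, job in enumerate(jobs):
        nodes[i % len(nodes)].jobs.append(job)


def place_first_fit(jobs, nodes):
    """Put each job on the first node with enough free memory for its request.

    Returns the jobs that fit nowhere; they stay queued instead of
    overloading a node.
    """
    pending = []
    for job in jobs:
        _, mem = job
        target = next((n for n in nodes if n.used() + mem <= n.mem_gb), None)
        if target is None:
            pending.append(job)
        else:
            target.jobs.append(job)
    return pending


def make_nodes():
    return [Node("agent-1", 16), Node("agent-2", 16)]


# Hypothetical job list: (name, memory request in GB).
JOBS = [("thumbs", 12), ("report", 2), ("reindex", 10), ("backup", 2)]

blind = make_nodes()
place_round_robin(JOBS, blind)

aware = make_nodes()
pending = place_first_fit(JOBS, aware)

for label, nodes in (("round-robin", blind), ("first-fit", aware)):
    for n in nodes:
        print(label, n.name, f"{n.used()}/{n.mem_gb} GB", "OVERLOADED" if n.overloaded() else "ok")
```

With these made-up numbers the blind placement stacks the two big jobs on one agent while the other sits nearly idle, and the request-aware placement fits everything without going over. Real schedulers do a lot more than first-fit, so treat this as an illustration of the idea only.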
Doing Jenkins right is not easy. There are solutions to all the problems, but it isn't just "fire it up and it works".
Stripe was running Chronos before, which is a Mesos scheduler. So they have experience with distributed cluster schedulers. They were probably comfortable with the idea of Kubernetes.
They mention this as a first step to using Kubernetes for other things. So they probably wanted to use Kubernetes for other things, and this seemed like a low-risk way to get experience with it. Just like GitHub started using Kubernetes for their internal review-lab to get comfortable with it before moving to riskier things (https://githubengineering.com/kubernetes-at-github/).
This is not true; all the configuration is scriptable via Groovy scripts. We run a bunch of Groovy startup scripts that configure everything post-launch. There is an effort by the Jenkins team to support this better[1].
> How do you isolate job executions in Jenkins? In Kubernetes each job is inherently isolated in containers.
We run one Docker container per build on Docker Swarm. Each build gets its own isolated, clean environment. There is no EC2 provisioning etc. We already own and maintain a Docker Swarm setup; we just run Jenkins and the Jenkins agents on it. I assume that if you are using Kubernetes it would be a similar setup.
> Jenkins is a single point of failure, it isn't a highly available distributed scheduler.
I agree with this to an extent. If you are running Jenkins on a scheduler it can be rescheduled, but your in-flight jobs are dead.
1. https://github.com/jenkinsci/configuration-as-code-plugin
Bingo! That's the point: it's a cron replacement.
But to tackle your first point: K8s might be distributed, but it's not inherently reliable. Yeah, sure, people run it in production, but there are a myriad of bugs that you bump into. I've lost clusters due to tiny issues that ran rampant, something that I've not had in other cluster or grid engine systems.
If we are talking AWS, then having the Jenkins master in an auto scaling group with decent monitoring sorts out most of your uptime issues.
The reason I say it'd take a day to configure Jenkins is because the jobs have already been set up in Chronos. It should literally be a copy-pasta job. All the hard work of figuring out which jobs are box killers, which can share and which are a bit sticky has been done already; all that's changing is the execution system.
What level of isolation are you after, and for what purpose? If jobs can't live on the same box, then that's almost certainly bad job design. (Yes, there are exceptions, but unbounded memory or CPU usage is just nasty.) There may be a need for regulatory isolation, but containers are not currently recognised as isolated for that purpose.
The author made clear multiple times that they were using cron jobs as a test bed for Kubernetes, and they chose to “overengineer” because they’re looking to use Kubernetes for more and more of their needs over time. You’re kind of arguing against a straw man.
I think it’s actually a great example of how Stripe thinks about technology choices.
They’re interested in choosing fewer tools that are better built and can grow to solve more needs. And they’re evaluating tools not just by “time to complete X random project”, but by other longer-term heuristics like maintenance levels. And the best way to do that is to start using the tool for a single need, investing more time in learning/research than is required for the need itself—ensuring that it really is a solid, foundational solution—with the understanding that you’re choosing technology for the long run. Then continue to expand your use of the tool over time, reaping benefits on your initial time investment.
At the point where you have to fix upstream bugs, that's the point where one says: fuckit, it's not stable enough, more trouble than it's worth. Let's use gaffer tape and move on. As for maintenance, without company buy-in for transplanting the _entire_ stack, it's questionable. And if there are only two people, and you have to maintain an entire distributed stack, that smacks of pain.
One company, one platform.
However, k8s does more than just scheduling where pods run. It also ensures that they run with the correct security and availability constraints. On top of that you get things like affinity (don't run this job on the same machine as that job, or, only run jobs for this tenant on nodes assigned to that tenant), storage management (connect this job to this volume), networking (only let this pod talk to this service and the monitoring layer, don't let anyone connect to the pods running the job), and much, much more.
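The two affinity rules in that parenthetical can be sketched as a simple node filter. This is plain Python written for illustration, not the Kubernetes API or its scheduler logic; the function name, the dict layout and the node, tenant and job names are all assumptions.

```python
def eligible_nodes(job, nodes, running):
    """Return the names of nodes where `job` is allowed to run.

    job:     dict with 'name', 'tenant' and 'avoid' (a set of job names it
             must not share a machine with)
    nodes:   dict of node name -> tenant the node is assigned to
    running: dict of node name -> set of job names already on that node
    """
    allowed = []
    for node, tenant in nodes.items():
        # Tenant rule: only run this tenant's jobs on that tenant's nodes.
        if tenant != job["tenant"]:
            continue
        # Anti-affinity rule: don't share a machine with a job on the avoid list.
        if running.get(node, set()) & job["avoid"]:
            continue
        allowed.append(node)
    return allowed


# Hypothetical cluster state.
nodes = {"n1": "acme", "n2": "acme", "n3": "globex"}
running = {"n1": {"billing-export"}, "n3": {"nightly-report"}}
job = {"name": "ledger-sync", "tenant": "acme", "avoid": {"billing-export"}}

print(eligible_nodes(job, nodes, running))
```

Here n3 is ruled out because it belongs to another tenant and n1 because it already runs the job to avoid, which leaves n2. The point of the comment stands either way: with Kubernetes you declare rules like these and the platform enforces them, instead of hand-rolling the filter yourself.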
Yeah, you can do that with Jenkins, or like, just cron. I know, because I did it for 18 years before I had ever heard of Kubernetes.
But, just like I can reach for Django or Rails or whatever it is that Java programmers use these days to build my web application, I can lean on Kubernetes to build my infrastructure.
I estimate that leveraging GKE has saved me in the range of $400k in direct employee costs, not to mention time-to-market advantages. As we grow, I expect that number to go higher.
I'm very sympathetic to the view that Jenkins, or something comparable, is viable and cost-effective for a lot of shops if you're looking exclusively at direct project costs.
As you've pointed out, though, as a building block of Enterprise software the ability to scale out in, and across, multiple clouds consistently is an economic and development boon so powerful I don't think one should really be looking at k8s as just a microservice/deployment platform: it's a common environment-ignorant application standard. Picking and choosing per service whether you should be hosting in GKE, AWS, or on-premise, applying federated clusters, recreating whole production environments for dev... It's a gamechanger.
It's totally possible to fire up a new Jenkins solution in EC2, but as of a few weeks ago Kubernetes is click-and-go in all three major cloud providers. It totally reshapes how we're looking at development projects with suppliers, testing, etc, as we can create fictionalized shared versions of our production environment for development, integration, and testing. As an emerging industry wide standard we can demand and expect Kubernetes knowledge from third parties in a way a home-brewed Jenkins setup could never match.
Yeah I was using my CI system to handle the CD constraints and it was so straightforward it hardly registered as work. I was setting up one build agent with a custom property and all the builds that couldn’t run simultaneously would all require an agent with that property. So they just queued in chronological order of arrival. Done. Next problem.
Remember - if there's a one in a million chance of a collision, it'll happen by next Tuesday.
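The arithmetic behind that rule of thumb is easy to check. A minimal sketch, assuming every run is an independent try with the same collision chance; the 100,000 runs per day is a made-up volume, not a figure from the thread.

```python
import math


def p_at_least_one(p: float, n: int) -> float:
    """Probability of at least one collision in n independent tries, each with chance p."""
    # This is 1 - (1 - p)**n, computed via log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(n * math.log1p(-p))


def tries_until(p: float, target: float) -> int:
    """Smallest n for which the chance of at least one collision reaches `target`."""
    return math.ceil(math.log1p(-target) / math.log1p(-p))


P_COLLISION = 1e-6       # the "one in a million" from the comment
RUNS_PER_DAY = 100_000   # hypothetical volume

week = p_at_least_one(P_COLLISION, 7 * RUNS_PER_DAY)
coin_flip = tries_until(P_COLLISION, 0.5)

print(f"chance of at least one collision in a week: {week:.1%}")
print(f"tries needed for a 50% chance: {coin_flip:,}")
```

At that assumed volume the odds of hitting the collision within a week are about even, and it takes on the order of 700,000 tries to get there. So whether it really is "next Tuesday" depends entirely on how many times a day you roll the dice.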
You provide a scalable infrastructure underneath your Jenkins install while not dealing with the issue of node/agent allocation. Plus, you get Kubernetes for your not-so-simple crons.
Has anyone used Airflow for cron jobs? Is it a good idea or a terrible one?
With a managed Kube offering, setting up Kube is much, much easier than this Jenkins setup you are suggesting. And there's no overhead charge. Why would anyone go through the hassle of manually provisioning machines like you suggest when AWS/GCP will do it for you?
It's overkill in the same way using DynamoDB for something that only experiences a handful of writes every day is overkill; who cares? The scale is there if you need it, but it doesn't cost anything to not use it.
From my experience, the hard part kicks in when dealing with a stateful service that needs to be associated with a volume.
Even with a managed cluster, you still have to solve that problem. Either you pre-provision disks or you use dynamic volumes.
Next is upgrading the K8s version. With a stateless service, it's a walk in the park to upgrade. With a data volume it's trickier, because you want to control the process of replacing nodes and to ensure the data volume gets mounted and migrated to the new node properly.
Things get harder still with stuff like Kafka/ZooKeeper, when pods get removed and re-balancing happens.
In other words, managed Kubernetes actually doesn't offer that much. You still have to plan carefully, and it doesn't magically solve every problem for you.
For 95% of people, I'd say going with the managed version is the right choice.
However, there are some reasons why you wouldn't use a managed service: if you need a custom build, custom drivers, etc.
With a little work you could expand that out to make a travis equivalent using the same code base.
Also remember this is Stripe, and they like to advertise through Engineering blogs (and they do that quite well to be honest).
I'm getting cynical here, but I sometimes wonder if they didn't specifically choose a cool shiny tool so that they can speak about it (and advertise through blogging).
For some reason Nomad seems to get noticeably less publicity than some of the other Hashicorp offerings like Consul, Vault, and Terraform. In my opinion Nomad is right up there with them. The documentation is excellent. I haven’t had to fix any upstream issues in about a year of development on two separate Nomad clusters. Upgrading versions live is straightforward, and I rarely find myself in a situation where I can’t accomplish something I envisioned because Nomad is missing a feature. It schedules batch jobs, cron jobs, long running services, and system services that run on every node. It has a variety of job drivers outside of Docker.
Nomad, Consul, Vault, and the Consul-aware Fabio load balancer run together to form most of what one might need for a cluster scheduler based deployment, somewhat reminiscent of the “do one thing well” Unix philosophy of composability.
Certainly it isn’t perfect, but I’d recommend it to anyone who is considering using a cluster scheduler but is apprehensive about the operational complexity of the more widely discussed options such as Kubernetes.
With the velocity of k8s it's hard to imagine how Nomad could catch/keep up. K8s has operators, Helm, etc. That just means you can add battle-tested components off the shelf with a single command. So, less wheel-inventing and boilerplate writing to do for us.
With the backing of a so much larger community and bigger entities, it also feels like I’m less likely to be the first one to discover a new bug. RedHat or Google or one of their customers will have hit and fixed it already, and my production platform keeps humming along nicely. K8s has just had more flight time and exposure to crazy environments and workloads, so more kinks are going to be ironed out.
I always did like the “do one thing right” unixy approach of Hashicorp’s toolset, and that you can pick the pieces you like. But (sadly for them) that means I can now pick Vault or Consul and run it on top of Kubernetes (re-using k8s' internal etcd is not recommended) if I wanted. I'm actually not overly sorry for them, seeing as how they're locking up more & more features behind enterprise products. I haven't checked in a while but wouldn't be surprised if they also had a Nomad Enterprise already. Nothing wrong with HashiCorp wanting to make money, but if there also is k8s without those restrictions..
Kubernetes seems to be a lot of magic and NIH and tries to do everything itself, whereas Mesos and Nomad are nicely composable and easy to reason about.
Nomad's biggest benefit for me is a very nice integration with Vault (and Consul). I can have Nomad ask for a container-instance-specific secret, which Vault then goes and generates and later immediately revokes once that container dies. Maybe this is possible with Kubernetes, but I have not seen anything that tight yet.
IAM instance profiles are nice, but they are instance-wide. Having each container get a unique, short-lived and properly scoped set of secrets, injected at the last possible time and immediately revoked afterwards, makes me feel all warm and fuzzy inside.
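The lifecycle described here (issue a secret scoped to one container at start, revoke it the moment the container dies) can be sketched in a few lines. This is a toy in-memory stand-in, not the Vault or Nomad API; `ToySecretStore`, `container_secret`, the container IDs and the scope strings are all hypothetical names.

```python
import secrets
from contextlib import contextmanager


class ToySecretStore:
    """In-memory stand-in for a secrets backend. NOT the Vault API."""

    def __init__(self):
        self._live = {}  # token -> (container_id, scope)

    def issue(self, container_id, scope):
        token = secrets.token_hex(16)
        self._live[token] = (container_id, scope)
        return token

    def revoke(self, token):
        self._live.pop(token, None)

    def is_valid(self, token, scope):
        entry = self._live.get(token)
        return entry is not None and entry[1] == scope


@contextmanager
def container_secret(store, container_id, scope):
    """Issue a secret for one container and revoke it when the container exits."""
    token = store.issue(container_id, scope)
    try:
        yield token
    finally:
        # Runs on a clean exit and on a crash, so the secret never outlives the container.
        store.revoke(token)


store = ToySecretStore()
with container_secret(store, "job-42", "db:read") as token:
    valid_during = store.is_valid(token, "db:read")   # usable while the container runs
    wrong_scope = store.is_valid(token, "db:write")   # scoped: no good for anything else
valid_after = store.is_valid(token, "db:read")        # revoked once the container is gone

print(valid_during, wrong_scope, valid_after)
```

The contrast with an instance-wide credential is that nothing here is shared between containers on the same box, and a leaked token stops working as soon as its container goes away.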
Not heard that criticism before, what are you referring to in particular? The NIH part seems incongruous to me, since Google were a major contributor in inventing warehouse scale computing and cluster schedulers (c.f. the Borg and Omega papers, etc.).
I would have to put so much effort in convincing customers and management to not go the (now almost default?) Kubernetes-route, that it's risky trying something else. A small hiccup in Nomad, would be enough for the pitchforks to come out.
The biggest benefits seem to be
(1) simplicity, but GCE and minikube are easy enough to learn in a day and
(2) ability to run non-containers, but docker containers are generic - they can run java apps just fine.
Nomad is operationally simple, you can run it out of your normal devops roles, you don't need dedicated staff. Mostly because you can pretty easily wrap your head around what it does and how it works.
This saves you bundles of cash and time.
Additionally, in the early days there were some tools missing (like online modifying the raft peer members) that are all there now.
Running in production and very happy with it!
Though Chronos had a release recently with a bunch of fixes, Mesos is inevitably fading as a legacy platform.
Because of Chronos? This is a bizarre thing to say. Mesos actually works extremely well. Whenever I ask the "why Kube over Mesos" question, I never get a good answer. I think it's because people just don’t know Mesos. Also, it wasn’t made by Google.
This is likely related to a set of Kubernetes bugs [1][2] (and gRPC [3]) that CoreOS is working diligently to get fixed. The first set of these, the endpoint reconciler[4], has landed in 1.9.
More work is pending on the etcd client in Kubernetes. The good news is that the client is used everywhere, so one fix and all components will benefit.
[1]: https://github.com/kubernetes/community/pull/939
[2]: https://github.com/kubernetes/kubernetes/issues/22609
[3]: https://github.com/kubernetes/kubernetes/issues/47131
[4]: https://github.com/kubernetes/kubernetes/pull/51698
Also, any large scale system like Borg developed at a large company like Facebook or Google will have completely opinionated one-way-of-doing-things for a lot of aspects. This doesn’t work for the world outside where lots of developers from different backgrounds, lots of projects with different requirements exist.
The implementation is effectively entirely from scratch, so bugs will exist.
A lot of traditional financial instruments 1) are not resilient to failure and 2) run at fixed times in batches. I’m confident it’s not their own systems that set the requirement of rigidity.
Their customers expect the cron jobs to run when they expect and how they expect.
With that constraint restarts look a lot less acceptable.
I chose Nomad because I'm already using Consul and I wanted to run raw .NET executables. Would it have been worth it to use Docker with .NET Core?
Not trying to change my infrastructure now, but just curious about whether it is worth the time to play with it on the side.
Although k8s does seem to be designed much better. I use it personally too and hope for its success.
How does Stripe's approach differ?
Is there an advantage to one over the other? It looks like in both cases you need a platform team (at least 2, maybe 3 people; we had a large, complex setup and had about 10) to set up things like K8s, DC/OS or Nomad, because they are complex systems with a lot of different components: Flannel vs Weave Net vs some other container network, handling storage volumes, labels, and automatic configuration of HAProxy from them (marathon-lb on DC/OS).
All schedulers (k8s, swarm, marathon) seem to use a JSON format for job information that's pretty specific, not only to the scheduler, but to the way other tooling is set up at your specific shop.