Most importantly: I want a lot fewer moving parts than it currently has. Being "extensible" is a noble goal, but at some point cognitive overhead begins to dominate. Learn to say "no" to good ideas.
Unfortunately there's a lot of K8S configs and specific software already written, so people are unlikely to switch to something more manageable. Fortunately if complexity continues to proliferate, it may collapse under its own weight, leaving no option but to move somewhere else.
This setup is very, very simple and scalable. There is very little to gain IMO from moving to Kubernetes.
Consul, vSphere and load balancers have APIs, and you can write tools to do everything that K8s does.
In some networks DNS failover is really not that great, so at least a virtual IP needs to be used.
Many of the major infrastructure/platform vendors are rolling out their own distribution of Kubernetes, either as a cloud service (e.g. AWS, Azure, GCP) or on-premises (e.g. Red Hat).
So I suspect they are going to try and differentiate on features and ease of use and make it as hard as possible to move anywhere else.
One thing I've really appreciated was how one could enable/disable things based on the binary version rolled in, and if it's rolled back the state goes back.
Basically something like this:
{
new_exp_feature = (binary_compiled_after_changelist( 123456789 ) || binary_compiled_with_cherrypicks( { 123456795, 1234567899 } ))
}
Since Piper is changelist-based (like Perforce/SVN), each "CL" goes up atomically, so you can use this to say: this specific flag should get turned ON only if my binary has been compiled with base CL > 123456789, or if it was compiled earlier but had these cherry-picks (e.g. individual changelists) built into it. But this was heavily integrated with the whole system, e.g. each binary would basically be built at some @base_cl, with additional @{cherry_pick_cl1, cherry_pick_cl2, ..} maybe applied. For example, the team decides to release at version @base_cl, but during the release bugs are found, and rather than rolling to a new @base_cl, just individual cherry-picks may be pushed. So basically you can then control (in your configuration) how to act (configuration could be pushed independently of your binary, though some systems would bundle them together). And then if you have to roll back, Borgcfg would re-evaluate all this and decide to flip the switch back (that switch would simply emit something like --new_exp_feature=true or --new_exp_feature=false (or --no-new_exp_feature, it was a long time ago so I could be wrong)). With git/hg you no longer have such monotonic order, but that monotonic order also worked best with monorepos (or maybe I'm just too narrow-sighted here)...
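To make the evaluation rule above concrete, here is a minimal Python sketch. It is not borgcfg syntax; `BinaryBuild`, `compiled_after`, `compiled_with_cherrypicks` and `flags_for` are hypothetical names, and it assumes that every listed cherry-pick has to be present for the second condition to hold.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BinaryBuild:
    """Hypothetical description of how a binary was built."""
    base_cl: int  # changelist the build was synced to
    cherrypicks: frozenset = field(default_factory=frozenset)  # extra CLs applied on top


def compiled_after(build, cl):
    # Changelist numbers only go up, so a larger base CL means the change is in.
    return build.base_cl > cl


def compiled_with_cherrypicks(build, cls):
    # Assumption: all listed cherry-picks must be part of the build.
    return set(cls) <= build.cherrypicks


def flags_for(build):
    # Re-evaluated on every rollout or rollback, so the flag follows the binary.
    enabled = (compiled_after(build, 123456789)
               or compiled_with_cherrypicks(build, {123456795, 1234567899}))
    return ["--new_exp_feature=" + ("true" if enabled else "false")]
```

Rolling back to an older build just means calling `flags_for` again with that build's base CL and cherry-picks, which is what flips the switch back.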
All of this seems way more complicated than the tools we use at my company. Is there a specialized need here I’m not seeing?
The evaluation rules are merely a borgcfg artifact.
Disclaimer: I maintain borgcfg.
There's still a learning curve, but it's much more humane than Kubernetes.
Kubernetes is very simple. And it will become much more complex with the growing hardware, network, and applications it's trying to manage.
What's missing is that there is a layer of complexity on top of k8s that is still left to figure out. And I think the operator pattern is the right abstraction for service jobs. Some kind of framework is still needed to handle the batch/offline workloads though.
There’s some work going on to have something more user-friendly (think Google’s Piccolo) - https://github.com/stripe/skycfg (disclaimer - I contributed to this project)
Have you tried Nomad?
Actually I used kubeadm, and the higher the version got, the better it worked for major upgrades.
At the moment, with the new master upgrade methods, I have not had any problems so far on two clusters.
Sadly I created my cluster with an "external" etcd, even though it is really internal, and I also tried to maintain my own certificates, which is now a PITA (at the time cert handling wasn't as good in kubeadm as it is now).
Also I have a CloudConfig/Ignition Config creator which can bootstrap all necessary configs to bootstrap a kubeadm cluster on ContainerLinux/Flatcar Linux. So if I really have time I can just recreate a new cluster and move everything over. (I.e. the only thing which is problematic in "moving" over is the database created with kubedb)
Also you can use keepalived as your kubeadm load balancer.
Also I'd only go with a managed k8s solution, and I'm not sure I'd consider k8s for older or non-microservice/containerized architectures. In the latter case though I don't think there's anything better out there in terms of orchestration.
I suggest every startup use a hosted k8s solution, which takes care of most things like authentication, networking, monitoring, updating, etc.
also keep away from templating systems such as jsonnet, which are huge overkill. you will end up writing a lot of code you will hate to read later. instead write your own yaml builder in CI, together with the parts that do docker image building and the code that deploys the microservices
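A "write your own builder" approach could look roughly like the sketch below. It is only an illustration: `build_deployment` and `render` are hypothetical helpers, the image name in the usage note is made up, and it emits JSON rather than YAML because kubectl accepts JSON manifests too, which avoids pulling in a YAML library.

```python
import json


def build_deployment(name, image, replicas=1, port=8080):
    """Return a Kubernetes Deployment manifest as a plain dict."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "ports": [{"containerPort": port}],
                    }],
                },
            },
        },
    }


def render(manifest):
    # Serialize deterministically so diffs between CI runs stay readable.
    return json.dumps(manifest, indent=2, sort_keys=True)
```

In CI the output of something like `render(build_deployment("billing", "registry.example.com/billing:abc123"))` could be piped into `kubectl apply -f -`.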
imo Google made a really smart move by open sourcing k8s as a latecomer among cloud providers. now infrastructure has become so insignificant since everything runs on docker and pods.
The first disappointment is setting up a local development environment. I failed to get minikube running on a MacBook Air 2013 and an Ubuntu ThinkPad. Both have VT-x enabled and Docker and VirtualBox running flawlessly. Their online interactive tutorial was good though, enough for learning purposes.
Production setup is a bigger disappointment. The only easy and reliable ways to have a production grade Kubernetes cluster are to lock yourself into either a big player cloud provider, or an enterprise OS (Red Hat/Ubuntu), or introduce a new layer on top of Kubernetes [1]. Locking myself into enterprise Ubuntu/Red Hat is expensive, and I'm not comfortable with adding a new, moving, unreliable layer on top of Kubernetes, which is itself built on top of Docker. One thing I like about the Docker movement is that they commoditize infrastructure and reduce lock-ins. I can design my infrastructure so it can utilize an open source based cloud product first and easily move to others or self-host if needed. With Kubernetes, things are going the other way. Even if I never moved out of the big 3 (AWS/Azure/GCloud), the migration process could be painful since their Kubernetes may introduce further lock-ins for logging, monitoring, and so on.
I think you might have misunderstood that page. The standard and universal way to deploy Kubernetes on to either your own bare metal or any cloud provider is to use kubeadm. However, if you would like a simpler and more automated solution and/or one backed by a vendor, you are welcome to pick any of the hosted platforms, distributions, or installers. CNCF has certified 70 conformant solutions: https://www.cncf.io/certification/software-conformance/
> Even if I never moved out of the big 3 (AWS/Azure/GCloud), the migration process could be painful since their Kubernetes may introduce further lock-ins for logging, monitoring, and so on.
If you choose open source solutions for logging and monitoring like Fluentd and Prometheus, then you can avoid locking into anyone's value added services and remain completely portable. If you decide to go with a vendor's solution, you may trade convenience for higher switching costs.
[1]: https://kubernetes.io/docs/setup/pick-right-solution/
Disclosure: I'm executive director of CNCF and run the conformance program.
Take AWS EKS as an example. Their feature page[1] does mention conformance. Then it mentions 20 other non-conformance focused features that create an effective lock-in.
k8s is becoming like OpenStack in this regard. You need to embrace a vendor version of k8s in order to have a functional cluster without a massive team.
If someone could point me to an article explaining why k8s is so much better than swarm, I'd really appreciate it. Are the big advantages only at 100-node scales?
It's quite simple for a 20 year SA to stand up a highly integrated environment with modular monitoring, directory service, virtualization and hybrid cloud options for all services in a week. Why don't you hire one of these for the job instead of recipe/containering yourself into 'doesn't work, I dunno' posts.
Because every.single.one of these "integrated environments" I've ever come across was an objective mess, poorly documented and littered with tech-debt.
It was clear the "20-year SA" had forced 20-year-old administration abstractions and ideas on top of modern infrastructure and application concerns. It was cheaper/better/easier to throw it out and rebuild on something like k8s than to make any attempt at "scaling" the existing solution.
You're simply trading the "in-fashion" disease for the "I'm a 20-year Linuxbeard I know best and no one tells me different" disease.
Why is it better to spend money to rebuild a fraction of K8S with a patchwork of infrastructure put together by a single person?
The kubernetes folks describe a tentative solution to cloud lock-in here: https://kubernetes.io/docs/concepts/cluster-administration/f... OP isn't the only one with those concerns.
It would be nice if you could move your cluster load between any of the cloud providers and your own on-prem setup as you go. For instance, I could see people wanting to have a default small cluster on their on-prem setup, and be ready to scale on cloud when needed.
It used to be really clunky, but these days all you need is a simple bash script or ansible play (or whatever you’re comfortable with) to get going.
But yeah, no unix philosophy vibes from k8s as a whole...
Actually, I've barely seen any Kubernetes cloud provider offer a meaningful service that could lock me in; they are basically managed Kubernetes clusters with their cloud services as plugins. You can verify this by comparing GKE/AKS/EKS: you'll find they are almost the same thing.
Red Hat OpenShift on RHEL, Pivotal Container Service on Ubuntu, Red Hat’s nextgen CoreOS based Kubernetes, Canonical’s Charmed Kubernetes Distribution on Ubuntu, etc. all have different config management, install, upgrade, and patching mechanisms that vary from Ansible, to Terraform, to BOSH, to Juju. Some handle PXE bare metal, some don’t. Etc.
There usually are free / no pay versions of the above that you can use self-supported, but then you’ll also need to coordinate your own upgrades and use community forums for q&a rather than being able to contractually have someone looking out for you and answering your questions.
If you’d prefer to avoid lock-in, all of that plumbing would otherwise have to be configured and scripted yourself with your chosen toolchain plus the newer “k8s small tools” like kubeadm, kops, Kubespray, etc.
As the old saying goes, open source is only free (as in beer) if your time has no value.
I am a pretty basic user. I started using k8s on this project as a learning exercise, and $100 was too much to pay for learning, but now on DO I get a similar cluster for less than half of the GKE price and I feel like it is worth it, considering all the simplicity and observability of deployments. Also, DO allows me to select regions without any price difference, so I was able to select Amsterdam to get 10 times better latency from where I live. My setup is quite basic: my app with around 8-10 pods, plus additional stuff such as cert-manager and Prometheus.
YMMV, but so far I am really happy with DO's offering, in terms of both performance and simplicity. I am not a power user and definitely operate at no scale, but using DO in general is much simpler than using GCP with GKE.
for me, if you’re in the cloud you don’t need k8s. your favorite cloud provider has already figured out logging and monitoring and the basic things you need to get going. (another story if you run on bare metal)
if you’re not running a legacy app you don’t really need containers either. containers are great for legacy apps, for poorly written software or if you like overengineering. the abstraction you need is called a vm. use it. (again if you are in the cloud).
your app/service/thing is not as complicated as you think it is (or at least it should not be). I see a lot of people feeling like they need to experiment with new technology, on the job, on whatever they are doing now. actually building something that works and is simple as fuck seems to take a backseat and these types of people will create a narrative around using the new flashy thing. this is how you end up with production systems leveraging tools in beta and you end up closing shop when you finally figure out that you don’t have the resources to understand and maintain what you’ve created.
there is a time and place to experiment and learn. on small projects or on your own time. it takes experience to understand the hype cycle and to distinguish good tech from the hype.
as for k8s? yes, it solves some problems but it also creates others. do you like basically spending the time you’ve saved on setup and deployment to maintain/troubleshoot/upgrade your cluster? knock yourself out.
It seems like most of the problems are actually about installing and running the K8S software itself, but 95% of companies won't be doing that and will use the managed offerings instead. This is no different from companies using the cloud over running their own DCs.
I think a lot of the complaints against K8s are from the ops side of things. In my org, I don't actually run or upgrade the K8s cluster myself, so those pain points aren't mine to bear. When you're running your own k8s, the operational complexity of managing the cluster itself is not trivial and the change in mindset for traditional sysadmin types is a substantial hurdle.
My own take: K8s (or something very much like it) is absolutely the future, but the operational challenges of migrating to it at this time should not be ignored if you want to run it yourself and have existing ops experience. This will only get easier over time as tooling improves and sysadmins start seeing that this is the future they have to embrace.
Infrastructure level monitoring is also very easy. For example, if you're on Datadog, you flip KUBERNETES=true as an environment variable in the datadog agent, and you'll instantly get events for stopped containers, with stopped reason (OOM, evictions, etc), which you can configure granular alerting on.
Let's say you're in a service-oriented environment and you want detailed network-level metrics between services (request latency, status codes, etc). No problem, two commands and you have Istio [2]. Istio has Jaeger built-in for distributed tracing, with an in-cluster dashboard, or you can export the OpenTracing spans to any service that supports OpenTracing. You can also export these metrics to Datadog or most other metrics services you use.
[1] https://kubernetes.io/docs/tasks/debug-application-cluster/l...
I also have Prometheus + grafana, which similarly collects lots of stats from around the cluster, but I'm fairly sure I'm the only person who uses that dashboard, since the only things hooked up to Prometheus are databases and such, no internal applications (yet!).
Being able to aggregate stdout/stderr across dozens of machines previously would have cost either a lot of Chef setup time or a contract with some provider. Now I get a fairly straight forward open-source stack that can be refined over time, and the yaml re-used very easily in any cluster. Plus, the metadata collected from Kubernetes about each log line is extremely useful (For example, out of the box you can query by Kubernetes labels for your graphs etc)
I believe most people use an EFK/ELK stack for centralized logging and Prometheus for Monitoring.
I was able to set it up successfully a couple of times, with more or less time required. Last time, I gave up after four days because I realized that what I need was a "I just want to run a simple cluster" solution and while k8s might provide that, its flexibility makes it hard for me to use it.
What they need to do is hire some people who are great teachers, explainers, etc. Avoid people who rely on already attained technical knowledge, design patterns, algos, etc. to pattern match on new tech and instantly grok it. Hire the 'noob' people who question the engineers who designed the tools and ask a ton of dumb questions about how it works, so they can then translate it into everyday tutorial paragraphs.
Who is it aimed at, app developers or platform operators? Clear, obvious contracts between the two roles are valuable, even if you decide to combine them.
I'm moderately hopeful that Knative will help in that regard, as it is more conclusively oriented towards the developer. But I am wary that since it leaves the implementation details completely visible, it may not achieve that goal.
Disclosure: I work for Pivotal, we have products based on both of these.
Definitely not the former. The YAML-based configuration is not a pleasant app deployment experience. Companies end up needing to do some sort of auto-generation for it to make it sane for app devs.
App developers want experiences similar to heroku. They want to git push and have applications safely roll out without downtime or configuration.
As an application developer, you probably also don't work with the Kernel and syscalls directly (anymore), so I guess you can expect higher abstractions and a smoother experience for Kubernetes in the future.
Just like backups, if infrastructure as code isn't tested, it's worthless.
This is a good one, and I have been thinking about this myself, even with smallish projects that might have tens of container-based applications/services. In my case I end up with what is essentially a 3-tier architecture, with each tier being a group of containers/machines with their own rules for startup/shutdown.
Unless you have a truly circular dependency, at which point they probably should be collapsed into a single service.
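The startup-order and circular-dependency point can be sketched with Python's standard `graphlib` module. The service names in the usage note are hypothetical; the only thing taken from the discussion is the idea that dependencies start first and that a true cycle is a design smell.

```python
from graphlib import CycleError, TopologicalSorter


def startup_order(depends_on):
    """Return services in an order where dependencies start first.

    `depends_on` maps a service name to the set of services it needs.
    Returns None when the dependencies are circular.
    """
    try:
        return list(TopologicalSorter(depends_on).static_order())
    except CycleError:
        # A real cycle: candidates for collapsing into a single service.
        return None
```

For a chain such as `{"web": {"api"}, "api": {"db"}, "db": set()}` the database comes first and the web tier last; shutdown would simply walk the same list in reverse.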
1) no matter what I think I know, there's too many dark corners to create an adequate course
2) K8S is such a dumpster fire that I shouldn't encourage others
3) there's a hell of an opportunity here
Thoughts? Worth pursuing? Anything in particular that should be included that usually isn't in this kind of training?
Best way to think of Kubernetes is that it was designed to be a successful open source project that was widely adopted as a standard foundation to build products. It wasn’t designed to be a useable product on its own.
We are at the equivalent of the Slackware, SLS, and pre-1.0 Debian and Red Hat stage of GNU/Linux distros circa 1994. Red Hat eventually ran away with most of the money by the late 90s, but in the meantime there is lots of opportunity to fill an unmet need.
Writing an ok tutorial isn't good enough. Writing an amazing tutorial is fine, if it is on a platform people know (such as LinuxAcademy, Pluralsight, or something similar).
I once wrote an article on getting started with a static website generator. I received a ton of praise in the comments, saying how great the step-by-step instructions are, and I felt great... Only to discover that I made a typo in one of the commands, and that if you actually went through the tutorial, there's no way you'd get past that one step, unless you knew what you were doing (in which case you wouldn't go through a getting started guide, most likely).
All I'm saying is, unless you can write amazing content on a platform where people go to learn and advance their career, no one's gonna use it, I'm afraid.
As a bonus, show how to use Gitlab for deploying and managing the app. Gitlab + Kubernetes could be the holy grail for modern, self-hosted development, however a good, complete tutorial/documentation is very hard to come by. One has to pick the pieces from a lot of different places with sometimes conflicting information.
I'd happily pay 100 Euros for such a course.
There is an opportunity for anything infrastructure related.
For the majority, it adds only a little value compared to the added infrastructure complexity, the cost of the learning curve, and the ongoing operation and maintenance.
What's the alternative? We spin up VMs, templated with AMIs, provisioned with an ASG? That works fine. But we want centralized logging. We want graceful restarts. We want automated rollbacks. The list goes on. These are not Google scale desires, these are "cost of doing business" asks for any cloud company. You can start building all of this on that core architecture of AMIs, or your cloud provider's equivalent, but all you're going to do is re-invent what Kubernetes does, probably worse.
Kubernetes' problem isn't that it solves problems most companies don't have. The problem is that these problems most companies have could be solved in a simpler way than Kubernetes, because most companies have the exact same problems.
Actually most medium to large companies do have this problem.
There are often a lot of different languages, libraries, versions, deployment methods, etc. And the appeal of Docker was that you can treat them all as black boxes. And the appeal of Kubernetes is that you have this rich support infrastructure to run them all hands-off at scale.
It definitely solves a problem. Just not particularly well.
Kubernetes "done right" is almost a part of your application. It becomes this "machine" that you throw stuff into and good stuff happens.
You'll need a team to integrate it into the pieces you require (auth, secrets, loadbalancers, permissions/app identities, monitoring and logging) but many places lack bits and pieces, and in my opinion k8s gives you a fast track to create a uniform application delivery platform.
What I don't like is that it kind of is the opposite of "the unix philosophy" and in that regard I prefer the hashicorp stack.
Probably because it started at Google, if it was created by IBM then we'd only hear about it on TV ads.
They run a self-hosted OpenShift cluster, which is managed internally by a team of 4. Not only does this make it a lot easier to spin up new environments, it also forces devs to include the ops team from the start for stuff they don't know, so corrections can be made early on.
The adoption failures are mostly networking issues specific to their cloud. Performance and box limits vary widely depending on cloud vendor and I still don't quite understand the performance penalty of the different overlay networks / adapters.
Consider a traditional monolithic application. In comes your HTTP request in one end, a bunch of cross thread communication happens, and database queries come out the other end. With that, you have 2 points of network communication.
Now with micro-services, you might have 4 or 5 applications that are needed to replace the above monolith. Throw in a service mesh on top of your cloud provider's SDN, and you've turned 2 points of network communication into 20 or more: the 5 micro-services talking to each other and the service meshes talking to each other. Add on top the additional processing overhead of maybe 1 to 2ms, and you've just added at best 10ms of round trip time to get to your databases, plus some more CPU. And to what benefit? TLS? You can do this in your application, or trust that your private network is private. Tracing? You can do this with PID matching and watching the kernel's networking stack.
For what I do, in theory, many things should not impact results. In practice, anything that upon measurement impacts results is stripped away. Think A/B testing but for every single component, including the major version of, say, the Python interpreter.
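One way to read that back-of-the-envelope math is sketched below. This is an interpretation, not the commenter's exact model: it assumes the request crosses the services one after another, that the 1 to 2ms applies per service in each direction, and that a sidecar proxy doubles each service's inbound and outbound connection. Both function names are made up.

```python
def added_round_trip_ms(services, overhead_ms_per_hop):
    """Extra round-trip latency for a request that crosses `services`
    services in sequence, paying the overhead once in each direction."""
    return services * overhead_ms_per_hop * 2


def network_points(services, points_per_service=4):
    # Assumption: one inbound and one outbound connection per service,
    # each doubled by a service mesh sidecar.
    return services * points_per_service
```

Under those assumptions, 5 services at 1ms each give 10ms of added round trip (the "at best" case), 2ms each gives 20ms, and 5 services come out at 20 points of network communication.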
That's how you end up running many things baremetal.
I'll say the future is not serverless but cloudless
Defense in depth exists for a reason.
Do you have any pointers/write-ups with more information or plans in this direction? I would be interested to learn more.
https://kubedex.com/google-gke-vs-microsoft-aks-vs-amazon-ek...
It's pretty easy to dig up speed tests of the overlay networks, but a lot of these are just rating userspace overlay networks. The new hotness is the plugins provided by the cloud vendor which integrate with their SDN, and I haven't seen a good benchmark for those yet.
Most interesting reading will be to look up managed kube networking plugins on github and look for open/closed issues with lots of stars.
I’m setting up a lambda test right now so I find it perfectly timed!
Of course it's 2019 and you have to migrate Hadoop to run on k8s now :)
My impression is that if you are a small shop and have the money, use k8s on google and be happy, but don't attempt to set it up for yourself.
If you only have a few dedicated boxes somewhere just use Docker Swarm and something like Portainer.
* About half of the post-mortems involve issues with AWS load balancers (mostly ELB, one with ALB)
* Two of the post-mortems involve running control plane components dependent on consensus on Amazon's `t2` series nodes
This was pretty surprising to me because I've never run Kubernetes on AWS. I've run it on Azure using acs-engine and more recently AKS since its release, and on Google Cloud Platform using GKE; and it's a good reminder not to run critical code on T series instances because AWS can and will throttle or pause these instances.
I really liked (and miss) the beauty of simplicity in Marathon, where everything was a task: the load balancer, autoscaler, app servers, everything. I think it failed because provisioning was not easy, it lacked first-class integrations with cloud vendors, and the documentation was horrible, horrible.
Kind of sad to see it lost the hype battle, and since then even Mesosphere had to come up with a K8s offering.
for f in */*.yaml ...
with a directory structure of:
drwxrwsrwx+ 1 root 1002 176 Jan 20 21:15 .
drwxrwsrwx+ 1 root 1002 194 Nov 17 20:06 ..
drwxrwsrwx+ 1 root 1002 68 Jan 20 20:50 0-pod-network
drwxrwsrwx+ 1 root 1002 104 Nov 1 11:18 1-cert-manager
drwxrwsrwx+ 1 root 1002 34 Jul 11 2018 2-ingress
-rwxrwxrwx+ 1 root 1002 93 Jan 20 21:15 apply-config.sh
drwxrwsrwx+ 1 root 1002 22 Jul 14 2018 cockpit
drwxrwsrwx+ 1 root 1002 36 Jul 3 2018 samba
drwxrwsrwx+ 1 root 1002 76 Jul 6 2018 staticfiles
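The actual contents of apply-config.sh are not shown, so the following is only a guess at the idea: walk each subdirectory's .yaml files in name order, so that the numbered directories (pod network, cert-manager, ingress) are applied before the applications. `ordered_manifests` and `apply_commands` are hypothetical helpers, and the commands are only built here, not executed.

```python
from pathlib import Path


def ordered_manifests(root):
    """Return every <dir>/<file>.yaml under `root`, sorted by directory, then file."""
    return sorted(Path(root).glob("*/*.yaml"), key=lambda p: p.parts)


def apply_commands(root):
    # One `kubectl apply -f <file>` invocation per manifest, in sorted order.
    return [["kubectl", "apply", "-f", str(path)] for path in ordered_manifests(root)]
```

Each returned command could then be handed to `subprocess.run(cmd, check=True)` so that a failing manifest stops the rollout before later ones are applied.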