The biggest complication is around Ingress and load balancers. https://github.com/kubernetes/ingress/issues/23
https://github.com/kubernetes/ingress/issues/17
The problem is that many people point towards kube-the-hard-way, which is NOT production ready. For example, how do you scale ingress itself: Deployments or a DaemonSet? How do you set up ingresses to pass through the source IP?
A lot of this is taken care of for you in the cloud, but NOT on bare metal.
[1] https://groups.google.com/forum/m/?utm_medium=email&utm_sour...
https://github.com/kubernetes/kubernetes/issues/27343
I was writing a blog about installing it on bare metal, but this issue got me blocked.
Disclaimer: I'm on the team that works on this.
It is a lot easier on the public cloud, or easier still on a managed service.
The Kubernetes on bare metal setup above is for full automation. You can do simpler manual installs with bootkube[1] or kubeadm[2], and this will get further simplified over the next few months.
[1] https://github.com/kubernetes-incubator/bootkube
[2] http://kubernetes.io/docs/getting-started-guides/kubeadm/
I've been keeping an eye out, waiting until I feel comfortable enough to use and support a kube cluster on bare metal.
In our experience, setting Kubernetes itself up for the first time is not that hard; the difficulties come from the fact that k8s is a fast-moving target with quickly evolving "best practices". Also, companies are struggling a bit to integrate a k8s-centric workflow with existing applications, particularly data stores.
Please join up at the SIG to help out! https://github.com/orgs/kubernetes/teams/sig-cluster-lifecyc...
Disclosure: I work at Google on Kubernetes.
Achieving HA isn't necessarily complicated. Etcd supports clustering by default and the master components have master election built in.
We also use haproxy locally on every machine to load-balance between the different API servers, so we don't need a central LB.
Setting up HA was only a small part of our overall effort. Making things robust and figuring out the small details was a lot harder.
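To make the local haproxy idea concrete, here is a minimal sketch in Python that renders a per-machine haproxy config. It is not our actual setup: the API server addresses, the local listen port 16443, the timeouts and the `render_haproxy_cfg` helper are all assumptions made up for illustration. Only the haproxy directives themselves (`mode tcp`, `bind`, `balance roundrobin`, `server ... check`) are standard.

```python
def render_haproxy_cfg(api_servers, listen_port=16443):
    """Render a minimal haproxy.cfg that balances TCP across API servers.

    api_servers: list of "host:port" strings (hypothetical addresses).
    listen_port: local port that kubelet/kube-proxy would be pointed at
                 (an assumption; pick one that does not clash locally).
    """
    if not api_servers:
        raise ValueError("at least one API server address is required")

    lines = [
        "defaults",
        "    mode tcp",  # pass TLS through untouched
        "    timeout connect 5s",
        "    timeout client 60s",
        "    timeout server 60s",
        "",
        "frontend kube_apiserver",
        f"    bind 127.0.0.1:{listen_port}",  # only reachable from this machine
        "    default_backend kube_apiservers",
        "",
        "backend kube_apiservers",
        "    balance roundrobin",
    ]
    # One backend entry per API server; "check" enables TCP health checks
    # so a dead API server is taken out of rotation.
    for i, addr in enumerate(api_servers, start=1):
        lines.append(f"    server apiserver{i} {addr} check")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_haproxy_cfg(["10.0.0.11:6443", "10.0.0.12:6443", "10.0.0.13:6443"]))
```

Each node's components would then talk to 127.0.0.1 on the chosen port instead of a single API server or a central load balancer.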
FWIW, HA is complicated in general; K8s HA is actually reasonable by comparison.
Other tasks, like upgrades and security, are harder. We're working on docs in SIG Cluster Ops, and you're welcome to join the discussion there.
There will be a 1.5 compatible release coming soon.
Disclosure - I do work with the fine folks who made this. But I have had the opportunity to use it myself, and have successfully brought up some small clusters to play around with. I'd say it's certainly worth a look.
They support bare metal providers, like Packet, but they also support inexpensive platforms, like Digital Ocean.
Really handy if your goal is to experiment with an up-and-running cluster and you don't want the hassle of installation.
This works on GCE, AWS, Azure, VMWare, bare metal (via MAAS) and LXD containers. We're currently supporting 1.4 but will support 1.5 (and upgrades) in the next week or so.
Disclaimer: I'm on the team that works on this.
In the main kubernetes git repo, I see 39 non-merge commits from @canonical, 99% of which are under cluster/juju/* and 1 doc fix for juju, so not really a value add to Kubernetes at all. How can I, as someone deploying kubernetes on bare metal, trust you to manage a difficult project when Canonical doesn't seem to contribute much at all to the project or ecosystem?
It is great that you're making kubernetes easy to manage and deploy, but other than ease of use (which GKE, from the authors of k8s, also does an excellent job of), what is Canonical's actual value add for paying you to manage k8s? Sorry for the terseness, but I'm genuinely curious what value add there is here. As a purely tech focused person, I simply don't see it. Even other firms doing exactly what you do (Apcera) have actual code change commits. I don't see a single commit from Canonical that does that.
GKE is great, but GKE is Google only. It's not on-prem, it's not cross cloud, and it's not portable. That's important to some people. Our contributions to cluster/juju are the distillation of our operational knowledge in running Kubernetes everywhere. The same upstream k8s, deployed with the same tooling, everywhere.
Not all value can be measured in commits :)
EDIT: My first response might have come off as rude, so I fixed it.
We really want these projects to have vibrant ecosystems. Canonical (and others) is contributing to that and helping users build solutions based on the existing code base. Adding features is great. Adding users is equally important.
But there's nothing stopping you from enlisting Azure PVs as a resource, Ceph-managed PVs, and other incantations of durable storage. I only ask that you really consider the cost/benefit of each, and pick what makes the most sense to you.
My thought would be to use the Azure PV disk type, and if that's not dynamic enough to meet your needs, then enlist Ceph + large volumes and carve those up into RBDs to share among your workloads.
I'm sure there are others with differing opinions, and I'm happy to help you work through them (but not on a HN comment thread).
Seek me out in Slack as @lazypower, or ping me on the Juju IRC channel, #juju on irc.freenode.net. I'm @lazypower there as well.
And finally, our juju user mailing list is another great resource for support questions like the above:
juju@lists.ubuntu.com
We plan to include other storage vendors and mechanisms, but they aren't on the roadmap for the very near term. If end users start requesting a specific storage solution, it would go in our planning doc and get added to the roadmap. We're quite active with our early adopters that give us feedback and file bugs/requests.
For now you can continue to use alternative storage providers such as NFS or Gluster. We don't have the PV creation+enlistment captured in the charm code just yet, but again, that's due to priorities. End users pretty much set the priorities for us, and we then circle back with some lightweight planning and execution.
Hope this helps!
https://github.com/kubenow/KubeNow
It tries to re-use as much as possible from great projects like Terraform, Packer, Ansible and the kubeadm tool, and just adds a thin layer on top of that (less risk for bit rot), which is an approach that seems appealing to me.
https://cloud.google.com/container-engine/docs/preemptible-v...
Disclosure: I work at Google on Kubernetes.
This recent discussion had more comments on that: https://news.ycombinator.com/item?id=13085941
A few things to keep in mind:
These maps are service centric, and abstract units as vertical columns in their respective diagrams. Services must be HA to be considered “production ready”.
Additional concerns that may/may-not be represented here:
- TLS security on all endpoints
- TLS key rotation in the event of compromise/upgrade/expiration
- Durable storage backed workloads
- ETCD state snapshots for cluster point-in-time recovery
- User/RBAC - this still needs more info before I can outline it (time limited)
- Network policy for namespace/application isolation (this is an unspoken requirement for many business units)
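As a rough illustration of the key rotation concern (the expiration case only), here is a small Python sketch that decides whether a certificate is due for rotation based on its notAfter timestamp. The helper names and the 30-day threshold are assumptions for illustration, not something the SIG has specified; the timestamp format is the one Python's `ssl` module reports for a peer certificate's `notAfter` field.

```python
import ssl
import time


def days_until_expiry(not_after, now=None):
    """Return days left before a cert expires.

    not_after: string such as "Jan  5 09:34:43 2018 GMT", the format
               used by ssl.SSLSocket.getpeercert()["notAfter"].
    now:       seconds since the epoch; defaults to the current time.
    """
    expires = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires - now) / 86400.0


def needs_rotation(not_after, threshold_days=30, now=None):
    """True if the cert expires within threshold_days (hypothetical policy)."""
    return days_until_expiry(not_after, now) <= threshold_days


if __name__ == "__main__":
    sample = "Jan  5 09:34:43 2018 GMT"
    print(needs_rotation(sample))
```

Rotation after a compromise or an upgrade is a separate trigger and would need its own signal; this only covers the calendar-driven case.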
The diagrams:
Kubernetes cluster services https://docs.google.com/drawings/d/1U4GBSg9Sdn7JspoxDyA4qwGM...
Kubernetes Binary Services Topology map https://docs.google.com/drawings/d/10sXtgdelUI3GbWjrYh2z5vhF...
Kubernetes Cluster node Maps (3) https://docs.google.com/drawings/d/1x1PEE0RKvCRnP5JCAjmfbr_7...
We left off working on a Network draft diagram. If you’re interested in contributing/participating in this process, join us in the #sig-cluster-ops Slack channel. We meet Thursdays (or have, new year schedule dependent).
It looks like this is going to be set to false for v1.5.1: https://github.com/kubernetes/kubernetes/pull/38708
Disclosure: I work at Google on Kubernetes.