And in the thread he mentions Crossplane as the cross-cloud way to do this: https://twitter.com/kelseyhightower/status/12963213771342315...
I also just generally love the thought of being able to manage any cloud resources at all via standard open protocols and systems. Compounding the investment, rather than having to invest in a bunch of specific, non-interconnected areas, is going to lead to great things.
It'll be interesting to see what architectural flourishes or innovations, if any, went into ACK's control loops.
Does it have functionality like KubeDB that makes a "dormant" version of the state store?
If you use something like that, why use K8S and not just use AWS services natively?
At the same time, you usually end up spending more money and having worse results when you don’t go all in.
As far as why use EKS vs ECS - the "native service"? They seem to have feature parity, and ECS is easier to use for the uninitiated. But there are so many people who know k8s, and your knowledge is portable.
Which brings up my second point. Most software engineers don't care about cloud mobility as much as they claim. They care about career mobility. There is a much better chance that you will leave a company and move to one on a different provider than that your company will switch providers. I'm not saying it's a bad thing to focus on technologies that give you, as an individual, the most optionality.
last time I deleted a cluster it failed because an NLB was still around but not accounted for as a CFN resource, even though I provisioned the cluster with CFN.
If a CRD is deleted, the CRs it describes are also deleted. So deleting a CRD (even accidentally) could end up deleting resources in AWS (e.g., backups). So, be careful.
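To make that concrete, here's roughly what an ACK-managed bucket looks like as a custom resource (the API group and field names here are based on the ACK preview and may differ in practice):

```yaml
# Hypothetical ACK-style custom resource for an S3 bucket.
# Deleting the Bucket CRD cascades to CRs like this one, which could
# in turn delete the real bucket behind it.
apiVersion: s3.services.k8s.aws/v1alpha1
kind: Bucket
metadata:
  name: my-backups
spec:
  name: my-backups-bucket   # name of the actual S3 bucket (field name illustrative)
```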
Some things being managed by Kubernetes would be really cool. Other things being managed by k8s could break things if something goes wrong. I would plan accordingly.
Our goal is to make this project have "no surprises" and therefore no unexpected destruction of resources. The specifics of how we mark resources as safe to delete instead of retaining them by default are under discussion in that GitHub issue.
The stateless stacks generally have a lot of development activity going on, and rapidly iterate. This is where most of our code and logic lives. This is where the vast majority of our deployment (and related cloud configuration) activity happens.
All of that thrash is kept away from the stateful stacks - think S3 buckets or DynamoDB tables - where, if THOSE thrash, we potentially get an outage at best, or lose data at worst (backups notwithstanding).
We DO NOT WANT stateless oriented stacks to own the lifecycle for stateful stacks. They inherently need to be treated differently. Or, at least the impact of mistakes is different.
The trick comes when you need to tie them together. To do this, we've added CloudFormation hooks and other deployment time logic that publish ARN and other connectivity info to our configuration store. The stateless services look up config values either during deployment or at runtime and are able to find the details they need to reference the state resources they need access to.
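A minimal sketch of that publish/lookup pattern, with an in-memory dict standing in for a real configuration store such as SSM Parameter Store (the store, key scheme, and helper names here are illustrative, not our actual implementation):

```python
# Hypothetical config store; in practice this would be SSM Parameter
# Store or similar. A dict keeps the sketch self-contained.
CONFIG_STORE = {}

def publish_stack_outputs(stack_name, outputs):
    """Deploy-time hook: publish ARNs/connectivity info from a stateful stack."""
    for key, value in outputs.items():
        CONFIG_STORE[f"/{stack_name}/{key}"] = value

def lookup_config(stack_name, key):
    """Lookup used by stateless services at deploy time or runtime."""
    return CONFIG_STORE[f"/{stack_name}/{key}"]

# The stateful stack's deployment publishes its details...
publish_stack_outputs("orders-data", {
    "table_arn": "arn:aws:dynamodb:us-east-1:123456789012:table/orders",
})

# ...and a stateless service resolves them later, without ever owning
# the lifecycle of the underlying resource.
table_arn = lookup_config("orders-data", "table_arn")
print(table_arn)
```

The point of the indirection is that the stateless stack holds only a key, never a hard reference to (or ownership of) the stateful resource.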
We've poked at toolsets like Amplify that lump everything together and have already been bitten numerous times. We've found that the difference between stateful and stateless resources should not be papered over, but instead emphasized and supported explicitly by tooling.
... all of this being one team's experience over the years, of course.
Very curious to see how this paradigm evolves here!
[edit]… Riffing on this just a little bit further… as I’m thinking about it here, it comes down to abstraction level. In a deployment or resource management domain, a generic “this is a cloud resource” isn’t very useful. What’s way _more_ useful is something like “this is a stateful resource” or “this is a stateless resource”, because that level describes resource behavior more clearly, AND how to interface with or manage those resources.
The echoes of software development principles here are intentional: robust cloud infrastructure management mirrors software dev practices as much as infrastructure management ones!
In this case the delete will appear to succeed, but the recreation, if done with the same name, may fail.
Of course we have to try this because it's a badass (tho obvious in hindsight) idea, but in practice it might have some downsides.
This AWS project will need to support a feature like that.
Assume anyone can destroy your infrastructure at any time (by mistake or otherwise). This could be done with CloudFormation, Terraform, API calls, or essentially any automation (with different levels of safeguards).
Be prepared for that. Be careful with your data. Not so careful with individual servers - they should be cattle, not pets.
EDIT: If this is a production system, one could take away any 'delete' permissions until they are needed again.
One layer of defense in all of these cases is keeping the IAM credentials that the configuration management tool uses from having any deletion permissions.
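For example, an explicit Deny on deletion actions in the tool's IAM policy (a sketch; the exact set of actions to deny depends on which resources you manage):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyStatefulDeletes",
      "Effect": "Deny",
      "Action": [
        "s3:DeleteBucket",
        "dynamodb:DeleteTable",
        "rds:DeleteDBInstance"
      ],
      "Resource": "*"
    }
  ]
}
```

An explicit Deny wins over any Allow elsewhere in the policy set, so this holds even if a broader admin policy is attached by mistake.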
Also, set the DeletionPolicy in CF to Retain.
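(Note that DeletionPolicy takes Delete, Retain, or Snapshot rather than a boolean.) On a stateful resource it looks like this, using a DynamoDB table as an example:

```yaml
Resources:
  OrdersTable:
    Type: AWS::DynamoDB::Table
    DeletionPolicy: Retain        # keep the table even if the stack is deleted
    Properties:
      TableName: orders
      AttributeDefinitions:
        - AttributeName: id
          AttributeType: S
      KeySchema:
        - AttributeName: id
          KeyType: HASH
      BillingMode: PAY_PER_REQUEST
```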
Sadly the startup's interest changed and then it went under (but the freedom I was given to explore there was the best experience I have ever had).
Pass, I'll check back in a year.
On the other hand, my brain segfaults on the recursive loop of how the layer-inversion gets modeled as IaC with a CI/CD pipeline. I guess it could work if you were very strict about having your provider-infra layer (CloudFormation/Terraform) do only the bare minimum to get your kube environment up, and then within that kube environment you used something like ACK to provision any cloud-provider resources your kube-managed apps/pipelines needed.
Yet another case where I'm like "I don't know if kube should be the answer to everything, but I sure as shit won't miss <x>".
And now there is a control loop. So if Ben in support accidentally deletes my queue, it'll get recreated.
Breaking away from the proprietary platform underlay is going to be great. Managing things more consistently is going to be really great.
Kubernetes introduced "controllers": edge-triggered, level-driven reconciliation with periodic resync. The user defines a desired state and the controller does its best to keep the infra in that state at all times. (Terraform has also moved toward this same design in recent times.)
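A toy illustration of that level-triggered model (Python rather than a real controller; the "cloud" here is just a dict, and the resource names are made up):

```python
# Desired state declared by the user (like a CR spec).
desired = {"my-queue": {"type": "sqs"}}

# Observed "cloud" state -- a dict standing in for real infrastructure.
cloud = {}

def reconcile(desired, cloud):
    """One level-triggered pass: converge observed state toward desired."""
    for name, spec in desired.items():
        if name not in cloud:
            cloud[name] = dict(spec)   # create missing resources
    for name in list(cloud):
        if name not in desired:
            del cloud[name]            # prune resources no longer desired

# Initial sync creates the queue.
reconcile(desired, cloud)

# Someone deletes the queue out-of-band...
del cloud["my-queue"]

# ...and the next resync recreates it, because the desired state still says
# it should exist. This is the "Ben in support deletes my queue" scenario.
reconcile(desired, cloud)
print(cloud)
```

Because each pass compares full desired state against full observed state (rather than reacting only to individual change events), out-of-band drift gets repaired on the next resync no matter how it happened.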
This establishes a consistent experience. Everyone knows you just need to run "kubectl get my-resource" to check your desired state, and any issues are surfaced in the resource's status and the controller logs. You can combine multiple controllers to achieve your desired application design. For example, Knative has its own kind called "Service", which has some custom components, some inherited from Istio, and things like ReplicaSets from the default Kubernetes controllers.