Turns out the uptime of both our git server and even the Jenkins instance beat GitHub by far and while the former only cost a marginal amount of CPU time on infrastructure I was running anyways, GitHub is a noticeable expense for us.
Of course it still saves me from the panic attacks every time I'm compelled to press the "Update now" button in Jenkins: either I do nothing and get my instance RCEd, or I press the button and who knows what plugin update will break which part of our setup. But while that was a constant fear in my mind, the amount of downtime caused by Jenkins plugin updates was zero, whereas what GitHub is doing lately is way, way, way worse than zero.
I'm starting to get frustrated and, like I presume many other paying users, I'm at a point where I feel we should get partial refunds of our subscription money given the very spotty uptime all year.
Never expose Jenkins to the public internet; make sure access is via VPN. If you need webhooks, there are services that broker them while Jenkins calls out from its side (i.e. without exposing ports). Even if you do have to use native webhooks, at least lock them down to the upstream's IP range(s).
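A minimal sketch of the "lock it down to the upstream's IP range(s)" idea, assuming you filter in a small service or proxy in front of Jenkins. The CIDR ranges below are documentation placeholders, and the name `is_allowed_source` is hypothetical; substitute the ranges your upstream actually publishes.

```python
import ipaddress

# Placeholder ranges (reserved for documentation), NOT real upstream ranges.
# Assumption: you replace these with the ranges your webhook sender publishes.
ALLOWED_WEBHOOK_RANGES = ["192.0.2.0/24", "2001:db8::/32"]


def is_allowed_source(remote_addr, allowed_ranges=ALLOWED_WEBHOOK_RANGES):
    """Return True if remote_addr falls inside any allowed CIDR range."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Anything that does not parse as an IP address is rejected.
        return False
    for cidr in allowed_ranges:
        network = ipaddress.ip_network(cidr)
        # Only compare addresses and networks of the same IP version.
        if addr.version == network.version and addr in network:
            return True
    return False


if __name__ == "__main__":
    for candidate in ["192.0.2.17", "198.51.100.5", "2001:db8::1"]:
        print(candidate, is_allowed_source(candidate))
```

In practice the same rule is usually enforced at the firewall or reverse proxy rather than in application code; the function only illustrates the check itself.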
Ideally have a dev Jenkins to test everything first before hitting upgrade on your prod instance and killing some plugins (hell, even better if it's all IaC and you can just spin up a Jenkins host per env, but ££$$$££/time etc.)
And GitHub's SLA is not great to begin with: 10% of spend refunded at 3 nines (99.9%) and 25% of spend at 2 nines (99%).
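To put those numbers in perspective, here is a rough sketch of the arithmetic. It assumes the tiers mean "10% credit when uptime falls below 99.9%, 25% when it falls below 99%", which is my reading of the comment above and not a quote from the SLA; the function names and the 30-day window are illustrative.

```python
def uptime_percent(total_minutes, downtime_minutes):
    """Uptime over a period, as a percentage."""
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def service_credit_percent(uptime):
    """Credit tier under the assumed reading of the thresholds."""
    if uptime < 99.0:
        return 25
    if uptime < 99.9:
        return 10
    return 0


def credit_amount(spend, total_minutes, downtime_minutes):
    """Refund for a period, given spend and observed downtime."""
    uptime = uptime_percent(total_minutes, downtime_minutes)
    return spend * service_credit_percent(uptime) / 100.0


if __name__ == "__main__":
    minutes_in_30_days = 30 * 24 * 60  # 43200
    # Downtime budget before each tier is hit, in minutes.
    print(minutes_in_30_days * 0.001)  # budget at 99.9%
    print(minutes_in_30_days * 0.01)   # budget at 99%
    print(credit_amount(1000.0, minutes_in_30_days, 60))
```

Over a 30-day window that works out to roughly 43 minutes of downtime before the first tier applies and about 7 hours before the second.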
How is it mediocre? Is it because of the CVEs that have been released in the prior years? I recall GitLab also having quite a bad week of CVEs in February[1].
How is it a bad ecosystem? If this is about plugins in order to do things, I actually like this framework - it lets there be specific owners for portions of the open source development.
Self-implodes? This seems like it would be tracked as a bug. I've encountered an instance where Jenkins wouldn't start due to a crypto issue but that was due to a bug and all I needed to do was install a patch.
I think that Jenkins can be thought of as a serious option if, like with anything else, you follow security protocols, i.e. don't allow public access, maintain RBAC standards, and have a maintenance schedule.
[1]https://about.gitlab.com/releases/2022/02/25/critical-securi...
We've been running GitLab on GKE for the past three years, no problems outside of initial migration pains.
https://www.microsoft.com/en-ca/servicesagreement/upcoming.a...
GitHub has a separate policy...
Oh you sweet summer child
There's a point where it's funny, and we're way past that.
For a second I thought someone must have deleted the actions yaml files.
This is a dangerous failure mode.
https://github.com/multiprocessio/dsq/pull/82
Screenshot here: https://twitter.com/phil_eaton/status/1542168020516216832
Jenkins is slow and a nightmare to maintain. It became a huge ball of mud that nobody wants to touch. Just keeping the lights on is a large burden for the infrastructure team.
Edit: It's just the HN title that says "now resolved." The GitHub status page says:
> We have identified the source of disruptions and are actively working on a mitigation. The systems are in recovery and services are returning to green.
Who shouldn't at the very least donate a bit to the various open source CI solutions, as a way to have some kind of hedge?
Each time this happens, it makes no sense to go all in on GitHub. Perhaps companies like ARM, and projects like Wine [1], ReactOS [2], etc already went with self-hosting or have a failsafe solution to fall back on.
[0] https://news.ycombinator.com/item?id=31815918
[1] https://www.phoronix.com/scan.php?page=news_item&px=Wine-Git...
Is there any way to minimize the impact of such GitHub incidents on everyone's daily projects and business?