But three things mentioned in their report do give me some confidence about the way CircleCI has engineered their internal systems:
1. They use SSO with 2FA ("an unauthorized third party leveraged malware deployed to a CircleCI engineer's laptop in order to steal a valid, 2FA-backed SSO session")
2. They maintain reasonably good audit logging (they could identify that "the third party extracted encryption keys from a running process, enabling them to potentially access the encrypted data" which had been exfiltrated)
3. They can rebuild everything from scratch ("we rotated all potentially exposed production hosts to ensure clean production machines")
A lot of companies pay lip service to best practices like these, but don't actually implement them thoroughly (or at all). The fact that CircleCI could rely on them under attack makes me think they're doing a better job than 90% of the SaaS companies out there.
>While one employee’s laptop was exploited through this sophisticated attack, a security incident is a systems failure. Our responsibility as an organization is to build layers of safeguards that protect against all attack vectors.
I was surprised by this part:
>To date, we have learned that an unauthorized third party leveraged malware deployed to a CircleCI engineer’s laptop in order to steal a valid, 2FA-backed SSO session... the malware was able to execute session cookie theft, enabling them to impersonate the targeted employee in a remote location and then escalate access to a subset of our production systems.
I'm surprised the SSO session token isn't bound to an IP address. I'd also expect access to prod overall to be whitelisted to CircleCI-owned IP ranges.
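One way to get the binding I'm describing: pin the session to a fingerprint of the client that authenticated, and reject the cookie if it's replayed from anywhere else. A minimal sketch (the `fingerprint`/`validate_session` names and the IP+User-Agent choice are my own illustration, not anything CircleCI's IdP actually does):

```python
import hashlib

def fingerprint(ip: str, user_agent: str) -> str:
    """Derive a client fingerprint to pin a session to."""
    return hashlib.sha256(f"{ip}|{user_agent}".encode()).hexdigest()

def validate_session(session: dict, ip: str, user_agent: str) -> bool:
    """A session stores the fingerprint of the client that logged in;
    any later request presenting the cookie must match it, so a cookie
    stolen off a laptop is useless from the attacker's machine."""
    return session.get("fp") == fingerprint(ip, user_agent)

# Session minted at login from the engineer's machine:
session = {"user": "engineer", "fp": fingerprint("203.0.113.7", "Firefox")}

assert validate_session(session, "203.0.113.7", "Firefox")       # same client: accepted
assert not validate_session(session, "198.51.100.9", "Firefox")  # cookie replayed elsewhere: rejected
```

The obvious cost is that mobile and CGNAT users get logged out whenever their IP rotates, which is presumably why most IdPs don't do this by default.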
Now some gripes:
* I never received an advisory email about this incident. I only received this follow-up to one of my GitHub machine accounts, not my primary billing account.
* Their secret-finding script is pretty bad. It just dumps out a bunch of metadata without helping to make it actionable. Environment variables still don't have a created_at field, so you can't verify which ones you might have missed in a broad key rotation.
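To make the gripe concrete: here's the kind of triage you *could* do if each secret carried a `created_at` timestamp (which, again, the export doesn't have). The field name and the cutoff date are hypothetical, purely to illustrate the missing workflow:

```python
from datetime import datetime, timezone

# Illustrative cutoff only -- in a real rotation you'd use the start
# of the compromise window from the vendor's advisory.
CUTOFF = datetime(2022, 12, 16, tzinfo=timezone.utc)

def needs_rotation(secrets):
    """If the export included a created_at per secret, anything created
    before the compromise window would be flagged as possibly-missed in
    a broad rotation. Without that field, you can't verify coverage."""
    return [s["name"] for s in secrets if s["created_at"] < CUTOFF]

secrets = [
    {"name": "AWS_SECRET_ACCESS_KEY",
     "created_at": datetime(2021, 3, 1, tzinfo=timezone.utc)},
    {"name": "NPM_TOKEN",
     "created_at": datetime(2023, 1, 10, tzinfo=timezone.utc)},
]

assert needs_rotation(secrets) == ["AWS_SECRET_ACCESS_KEY"]
```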
I used to live somewhere where outbound traffic went through one of three CGNAT IPs at random, and I only had auth issues with one really old site that predates the NAT hell that is the modern internet.
It would be possible to do some kind of check for "this session token was used in the US and Russia twenty minutes apart... something's fishy," but that adds more complexity.
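The "US and Russia twenty minutes apart" check is usually called impossible-travel detection: compute the great-circle distance between two uses of the same token and flag it if the implied speed is faster than a plane. A sketch (the 1000 km/h threshold is my assumption; real systems tune it and have to tolerate VPN noise):

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 6371 * 2 * math.asin(math.sqrt(a))

def impossible_travel(ev1, ev2, max_kmh=1000):
    """Flag two uses of one session whose implied travel speed exceeds
    max_kmh. Events are {"lat", "lon", "t"} with t in seconds."""
    dist_km = haversine_km(ev1["lat"], ev1["lon"], ev2["lat"], ev2["lon"])
    hours = abs(ev2["t"] - ev1["t"]) / 3600
    return hours > 0 and dist_km / hours > max_kmh

# Same token seen near Washington DC and Moscow twenty minutes apart:
dc = {"lat": 38.9, "lon": -77.0, "t": 0}
moscow = {"lat": 55.8, "lon": 37.6, "t": 20 * 60}
assert impossible_travel(dc, moscow)  # thousands of km in 20 min: fishy

# Same city an hour later: fine.
nearby = {"lat": 38.9, "lon": -77.1, "t": 3600}
assert not impossible_travel(dc, nearby)
```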
If you are concerned about stable IPs, use a proper VPN or bastion setup.
Did they? I got an email this morning that pointed to _this_ blog post, but I never received any initial "rotate yo keys" communication from them, on any email address.
If I hadn't read HN, and no one at my company had, and our use of CI was running smoothly (they eventually put up a banner in the UI), I would literally never have known until this email.
What kind of freaks me out about this is that a customer notified Circle. If that customer hadn't mentioned anything, where would we be now?
I have to say, it's a pretty impressive hack. I wonder who or what was behind it?
Also wondering why/how the attacker didn't get access to the runners.
> Though all the data exfiltrated was encrypted at rest, the third party extracted encryption keys from a running process, enabling them to potentially access the encrypted data.
E.g.: did a new employee get access to production systems? Were there not enough people to monitor the systems and detect the breach sooner? Etc.
Shouldn't that have been the case from the beginning? Why did more than a small group of employees have production access at all?