* no detailed analysis of how the attack was undertaken. It's not even clear how the attacker managed to get in (was it a publicly exposed Jenkins? a vulnerable bastion? what?)
* no analysis of what the existing matrix.org security perimeter looked like or how it could be made better.
* repetition of security tropes. Use a VPN. Use GitHub Enterprise (wait, wtf? Why not private repos on GitHub?). Don't use Ansible, use Salt.
Ridiculous. I was looking forward to a nice long read about how this breach was undertaken. Hugely disappointed.
But yes, publicly exposed jenkins and repos lead to the compromise, not an uncommon story unfortunately.
Perimeter - I didn't see much evidence of one existing and I didn't go probing their networks to find out.
Security tropes are real for a reason, you don't have to believe me though.
Private repos in GitHub are still publicly hosted and are orders of magnitude easier to get into than an in-perimeter repo. They've leaked before and they'll keep on leaking. GitHub even made it harder for people to fork private repos into their own public accounts, but it still happens.
Can you provide some actual instances of this happening? Genuinely curious, as my org is currently migrating from enterprise to cloud.
You mean the past-tense verb led, not its metallic homonym lead. :)
Firstly having a private network for your infrastructure isn't a one stop solution for keeping attackers out.
Secondly using Github Enterprise or self hosted GitLab doesn't make up for storing secrets in Git.
Looking forward to the proper write up.
And yes, using GHE or self-hosted GitLab doesn't make up for storing secrets, but it at least keeps them out of the public eye so the effects are less brutal. It's still bad to store secrets in a code repository.
My whole point is that you can reduce risks easily, yet some people don't for some reason.
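To make the "secrets in Git live forever" point concrete, here's a rough, self-contained sketch: it commits a fake secret into a throwaway repo and shows that plain `git log -p` surfaces it even after the file would be deleted. Every name here (settings.env, the variable) is illustrative, not taken from the breach.

```shell
set -e
# Throwaway repo so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
printf 'AWS_SECRET_ACCESS_KEY=not-a-real-key\n' > settings.env
git add settings.env
git -c user.email=demo@example.com -c user.name=demo commit -qm 'add config'
# The secret now lives in history; deleting the file later won't remove it.
git log -p --all | grep -c 'AWS_SECRET_ACCESS_KEY'
```

Real secret scanners go much further than a single grep, but the point stands: once committed, rotation is the only real fix.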
The sales-pitch for Salt (against Ansible) is ridiculous and misguided.
I just checked out the Salt SSH module and even if they had used Salt they would still have had this issue. The answer here is to not use the default /etc/ssh/sshd_config value of `#AuthorizedKeysFile .ssh/authorized_keys .ssh/authorized_keys2`: uncomment it and remove authorized_keys2.
I feel like this is an important part of the story for anyone looking for teachable infosec moments.
When I say the attacker persisted via CM, I'm pointing at his own notes, nodding to the broken CM, the requirements of supporting that CM, and the availability of the config files.
I also sanity checked the sshd_config file on my systems, they're all set to a sane default:
"AuthorizedKeysFile .ssh/authorized_keys"
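A minimal way to sanity check this yourself (sketch only, run against a temp copy here rather than the live file) is to flag the legacy authorized_keys2 entry and then rewrite to the single canonical path:

```shell
# Reproduce the shipped default in a temp file standing in for
# /etc/ssh/sshd_config, flag the legacy second path, then fix it.
tmp=$(mktemp)
printf 'AuthorizedKeysFile .ssh/authorized_keys .ssh/authorized_keys2\n' > "$tmp"

# Before: the legacy authorized_keys2 path is still honored.
grep 'authorized_keys2' "$tmp" && echo "WARN: authorized_keys2 still honored"

# After: keep only the canonical path, as in the comment above.
printf 'AuthorizedKeysFile .ssh/authorized_keys\n' > "$tmp"
grep -q 'authorized_keys2' "$tmp" || echo "OK: single key file only"
rm -f "$tmp"
```

Remember sshd needs a reload after the real file changes.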
FWIW I prefer to treat CM data as "valuable" information for this reason.
Either way, a DNS hijack is not great, but not nearly as bad as the initial compromise.
"The API key was known compromised in the original attack, and during the rebuild the key was theoretically replaced. However, unfortunately only personal keys were rotated, enabling the defacement."
We will write this up in a full postmortem in the next 1-2 weeks.
We’ll publish our own full post-mortem in the next 1-2 weeks.
> One of the more interesting pieces of this was how Ansible was used to keep the attacker in the system.
Fwiw the infra that was compromised was not managed by Ansible; if it had been we would likely have spotted the malicious changes much sooner.
You should not be running package managers on production servers. Or any of the other things salt, ansible, chef, puppet can do.
CVE-2019-1003001, CVE-2019-1003002 -> Anyone with read access to Jenkins can own the build environment.
CVE-2019-1003000 -> I didn't get a lot of the details on this but it basically looks like "broken sandboxing, you can run bad scripts".
This is also a good resource: https://packetstormsecurity.com/files/152132/Jenkins-ACL-Byp...
Don't deploy a public-facing Jenkins, especially if it has credentials attached to it. It's really hard to secure, especially if pull-requests can run arbitrary code on your agents.
Jenkins / CI is the sudo access to most organizations.
Example: https://twitter.com/R1CH_TL/status/1118559239084158977
So any reasonably competent hacker can re-validate the entire forum's content and votes, reasonably quickly reimplement the whole thing, and/or fork the forum at any time.
What can you learn from the compromise? Never use an agent. Kill it with fire^H^H^H^H -9.
2) The SSH agent often starts automatically, frequently without user interaction (even if you specify `-i keyfile`). The SSH client and DBus are both culprits here; there are others too.
3) There are often multiple different agents installed on Linux desktop systems. For example, ssh-agent, gnome-keyring, seahorse, gpg-agent ... the list goes on. Good luck auditing that.
4) Without `-i keyfile`, SSH will present every key cached in your agent to the remote server, in sequence (which can also trip active firewalls that rate-limit failed authentication attempts)
5) If the keyfile you specified in `-i keyfile` does not authenticate, then SSH will fall-back to using keys cached in your agent. That's especially frustrating since you might want to know that the key you specified was rejected!
6) Removing the executable flag from ssh-agent is not a permanent solution: updates will often overwrite the program with a new file and reset the executable bit. Obviously the same goes for renaming the program (that one causes a hell of a lot more noise in the logs, btw; programs seem to complain louder when a binary can't be found than when it merely isn't executable)
7) See also (related) concerns I posted about GPG agent on Stack Overflow [1]
Last, but not least: 8) Hope you don't use a system where agent forwarding or agent caching is turned on in the system settings!
This seems to be a good post about the problem: https://heipei.io/2015/02/26/SSH-Agent-Forwarding-considered...
The key takeaway is that using ssh -A with default settings allows root on the system you've connected to "to impersonate you to any host as long as you’re connected".
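A quick way to audit for that footgun is to scan your client config for any `ForwardAgent yes` before a remote root gets to use it. A sketch, with a demo file standing in for ~/.ssh/config and a made-up "bastion" host:

```shell
# Flag any host stanza that hands your agent to the remote side.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
Host bastion
    ForwardAgent yes
Host *
    ForwardAgent no
EOF
if grep -qE '^[[:space:]]*ForwardAgent[[:space:]]+yes' "$cfg"; then
  echo "WARN: agent forwarding enabled for some host"
else
  echo "OK: no forwarding"
fi
rm -f "$cfg"
```

If you genuinely need to hop through a bastion, `ProxyJump` avoids exposing the agent at all.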