For fun, read that last paragraph out loud to a non-techy nearby and watch their eyes...
"Our recommendation here is to make use of the latest coturn which by default, no longer allows peering with 127.0.0.1 or ::1. In some older versions, you might also want to use the no-loopback-peers."
> So Slack's VoIP uses WebRTC, which connects via UDP/TCP to always send SRTP packets through a TURN proxy (which extends STUN via ICE) to work around usual NAT problems. These guys scanned the TURN and found an SSRF which allowed them to connect to Slack's VPC on AWS using IAM temporary credentials. Interesting.
I'm still developing it a bit, but my solution is open source [1]. If anybody wants to use this, I'm happy to provide answers to questions and quick bug fixes (as this directly benefits my work right now). If you're using Kubernetes, this is a pretty easy drop-in for your pod. It's part of our default setup now.
If you're exclusively interested in mitigating SSRF, a more targeted solution is to run your connections (HTTP or TCP) through a proxy that enforces network-level rules. That seems like it would have worked here. For HTTP SSRF, Stripe has a good tool, Smokescreen.
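To make the "network-level rules" idea concrete, here is a minimal Python sketch of the kind of destination check such a proxy might apply before opening an outbound connection. The function name `is_destination_allowed` and the deny-list are assumptions for illustration only; this is not how Smokescreen or coturn implement it, and a real deployment would likely need more ranges.

```python
import ipaddress

# Hypothetical deny-list: loopback, RFC 1918 private ranges, link-local
# (which covers the 169.254.169.254 cloud metadata address), and the
# IPv6 loopback, unique-local and link-local ranges.
DENIED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "127.0.0.0/8",
        "10.0.0.0/8",
        "172.16.0.0/12",
        "192.168.0.0/16",
        "169.254.0.0/16",
        "::1/128",
        "fc00::/7",
        "fe80::/10",
    )
]


def is_destination_allowed(ip_text: str) -> bool:
    """Return False if the destination IP is malformed or in a denied range."""
    try:
        addr = ipaddress.ip_address(ip_text)
    except ValueError:
        # Anything that does not parse as an IP literal is rejected.
        return False
    # Unwrap IPv4-mapped IPv6 (e.g. ::ffff:127.0.0.1) so it cannot
    # be used to sneak past the IPv4 rules.
    if addr.version == 6 and addr.ipv4_mapped is not None:
        addr = addr.ipv4_mapped
    return not any(
        addr in net for net in DENIED_NETWORKS if net.version == addr.version
    )
```

The check is meant to run on the address the proxy is actually about to connect to (after DNS resolution), not on the hostname the client supplied; otherwise a hostname that resolves to an internal address would pass.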
For example, I have Grafana backed by Postgres, and they both understand this authentication scheme out of the box. Postgres is happy to be provided with a cert to present to connecting applications, and is happy to check the cert that applications present against the CA cert.
The main problem with my setup is that I use a ClusterIssuer CA, so really anyone in the cluster can get a valid certificate. This is not amazingly secure and things like Istio do a bit more provenance checking of the application before issuing a cert, which I like. But this is simple, and does protect against the attack this article covers -- as long as you don't go out of your way to present the application's cert when proxying a connection. (Which is probably an easy mistake to make, so be careful.)
MeTaLS won't provide you with client-related stuff, but most clients and client libs make it easy to set a certificate/key with a request.
I'm not sure if any of the PSK modes manage to work with perfect forward secrecy though. Otherwise, leaking the PSK would also allow decrypting any previously-sniffed traffic.
What is the advantage over simply routing the media streams through application servers (i.e. user A connects to a server which links to user B), which can then perform application-specific authentication, enforce restrictions on payloads, etc.? Performance?
However, many firewalls block port ranges, or even UDP entirely. What you really want is a way to let people speak WebRTC over a common port (443 TCP is almost never blocked.) TURN facilitates this. Sometimes it's built into SFUs, sometimes not, and requires coturn in front of it. In Slack's case (and the project I work on as well) they are running Janus, which does not have TURN built in, and hence, run coturn to facilitate TURN.
Slack's approach is particularly interesting because they always push people through TURN, instead of allowing direct connectivity to their SFU. Hard to say why exactly, but it's probably a mix of wanting to lock the SFU down on the private network and being able to push TURN to the edge while keeping the SFU on a private LAN. I don't think you typically do this: you run TURN and SFU both with public IPs, and the client connects to one or the other depending on which ICE candidates win (which is a function of their firewall rules: your browser tries to pick the 'best' candidate it can reach, ideally one over UDP without a TURN hop).
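As a rough illustration of why a relayed (TURN) candidate normally loses to a direct one, here is a small Python sketch of the candidate priority formula from RFC 8445, using the type preference values that RFC recommends. The function name and defaults are mine, and real ICE agents rank candidate pairs and only use ones whose connectivity checks succeed, so treat this as a simplification.

```python
# Type preferences recommended by RFC 8445 (higher is preferred).
TYPE_PREFERENCE = {
    "host": 126,   # address on a local interface
    "prflx": 110,  # peer-reflexive, learned during connectivity checks
    "srflx": 100,  # server-reflexive, learned via STUN
    "relay": 0,    # relayed through a TURN server
}


def candidate_priority(cand_type: str,
                       local_preference: int = 65535,
                       component_id: int = 1) -> int:
    """Priority = 2^24 * type pref + 2^8 * local pref + (256 - component ID)."""
    return (
        (2 ** 24) * TYPE_PREFERENCE[cand_type]
        + (2 ** 8) * local_preference
        + (256 - component_id)
    )


# Rank candidate types from most to least preferred.
ranked = sorted(TYPE_PREFERENCE, key=candidate_priority, reverse=True)
print(ranked)  # ['host', 'prflx', 'srflx', 'relay']
```

If the firewall blocks everything except the TURN path, the relay candidate is the only one whose checks succeed, so it wins despite having the lowest priority.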
Someone is doing this right now for Pion, and I'm really excited to see it. I am especially excited to see what it means for deploys: right now, asking people to expose port ranges adds so much overhead versus one UDP and one TCP port for media.
Rather than falling back from p2p to STUN to TURN, why not replace TURN with something more application/protocol-specific?
Perhaps a webrtc-only proxy that performs authentication and can perform authorization along the lines of: user A is (only) allowed to connect to user B using protocol WebRTC.
I'm assuming, perhaps incorrectly, that most of these RTC connections are happening over NAT and therefore usually go over TURN rather than by connecting directly. Even if that's not the case, why not try direct p2p connection first then fall back to routing through an application-specific proxy, which can have tighter controls on who connects to who and what payloads they send?
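A hypothetical sketch of the authorization rule described above ("user A is only allowed to connect to user B using protocol WebRTC"), in Python. The `RelayRequest` type, the session table and the `authorize` function are all invented for illustration; they do not correspond to any existing proxy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelayRequest:
    source_user: str
    target_user: str
    protocol: str


# Hypothetical state: the application records which pairs of users
# currently have a call set up, and which protocols may be relayed.
ACTIVE_CALLS = {frozenset({"alice", "bob"})}
ALLOWED_PROTOCOLS = {"webrtc"}


def authorize(request: RelayRequest) -> bool:
    """Allow relaying only between users in an active call, over WebRTC."""
    if request.protocol not in ALLOWED_PROTOCOLS:
        return False
    pair = frozenset({request.source_user, request.target_user})
    return pair in ACTIVE_CALLS


print(authorize(RelayRequest("alice", "bob", "webrtc")))    # True
print(authorize(RelayRequest("alice", "carol", "webrtc")))  # False
print(authorize(RelayRequest("alice", "bob", "http")))      # False
```

The point of the design is that the destination is a user identity the application already knows about, never an arbitrary IP and port chosen by the client, which is what made the generic TURN relay abusable.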
https://webrtchacks.com/slack-webrtc-slacking/
Given that the latest coturn has this vulnerability mitigated by default, perhaps all this boils down to is "Slack runs outdated software, we exploited it."?
(This just seems to be one of those bugs that every proxy goes through at some point, just like pretty much any attempt to write a web server that serves files off disk will have at least one directory traversal bug.)
November 2017: added TURN abuse to our stunner toolset
December 2017: discovered and reported TURN vulnerability in private customer of Enable Security
February 2018: briefly tested Slack and discovered the vulnerability
April 2018: submitted our report to Slack, helped them reproduce and address the issue through various rounds of testing
May 2018: Slack pushed patch to live servers which was retested by Enable Security
January 2020: asked to publish report
February 2020: disclosure delayed by HackerOne/Slack
March 2020: report published
Just add extra linebreaks