Why does this happen? Multiple reasons. Implementations of big number math can and does contain bugs. (I used to hunt for those via fuzzing, which turned up an amazing number of them.) Hardware failures. Other bugs that corrupt numbers in memory.
The basic attack is well known. Florian Weimer has demonstrated this against TLS in the wild: https://www.redhat.com/en/blog/factoring-rsa-keys-tls-perfec...
The new thing this paper adds is applying this attack to SSH.
There is a countermeasure against this attack, and this is to verify the signature before revealing it. It works. As the paper says, openssh uses openssl's RSA implementation, and it has been doing that since forever (2001).
So in summary: Applying a well-known attack against RSA to its use in SSH. Only works if you have an RSA implementation that outputs results of flawed computations. Countermeasures exist, and RSA implementations should use them.
Yeah, there are pitfalls in RSA, but so are there in elliptic curve algorithms. I'm not sure if I'd say RSA has more pitfalls than ECDSA. Ed25519 is better, but so is RSA-OAEP/PSS over PKCS #1.5, where many of the common RSA pitfalls are. Most of the RSA pitfalls are in RSA encryption, which is irrelevant if you only use signatures.
You don't have to select primes very carefully. You can select random numbers, check if they are of the right size and prime, and you're good. What some people tend to do is to select primes in a "smart" way, which this post rightfully points out is problematic (see ROCA). But they also reference the batchgcd issue, which is not really an RSA issue. It is ultimately a bad RNG issue, and the very same issue also caused ECDSA implementations to break (with the same catastrophic "you reveal your private key" result).
My comments on "Seriously, stop using RSA":
Upstream OpenSSH uses LibreSSL, which they've forked from OpenSSL precisely because they were concerned with code quality and correctness.
I don't know whether this problem affects LibreSSL, but one of their main goals was to be less afraid of breaking the OpenSSL API to fix usability problems that lead to incorrect (insecure) code.
Look for the comment:
/*
* 'I' and 'vrfy' aren't congruent mod n. Don't leak
* miscalculated CRT output, just do a raw (slower)
* mod_exp and return that instead.
*/I'm curious, can you give some examples what kind of bugs did you discover?
Ultimately: things like a^b or a/b would return wrong results for certain inputs.
That's such an obvious-in-hindsight idea. I love it.
And they say crypto is hard, sheesh...
Seriously though, almost every time I hear about some new (to me) attack, I get amazed at the ingenuity of people.
Damn those Chinese hackers again!
My point was exactly that, it's bloody hard. Not only implementing it correctly, but all the non-obvious ways it can go wrong that'spartially out of your control due to non-ideal hardware (in the mathematical sense). Timing attacks, cache leaks, speculation leaks, this...
> Our combined dataset of around 5.2 billion SSH records contained more than 590,000 invalid RSA signatures.
Am I reading this right? This is about 1 in 10_000, this is way more common that what I would have imagined
Such bugs tend to show up in crappy IoT hardware. IoT hardware often comes in large numbers.
If you scan the IPv4 space for SSH hosts, most of the ones you'll find are IoT hardware.
* In (rare) vulnerable targets, this allows you to recover the host's key, and thus impersonate a host. You can't compromise client credentials with this attack, since client credentials are exchanged after the (active) secure channel is established. If you can impersonate a host, as this attack would allow you to do, you could capture client password credentials, and you can drive a forwarded agent.
* OpenSSH --- really, SSH servers on any Unix host you've been using in the last 20 years --- isn't vulnerable to this attack. The vulnerability is publishing a signature that is validly signed under RSA p and not under RSA q. Solution: just never do that; when you generate the signature, check it yourself before publishing. This is one of the better-known attacks on RSA, so this is a standard implementation countermeasure.
* The things that are vulnerable are crappy middleboxes from Zyxel, Mocana, apparently a rare subset of Cisco devices, and whatever "SSH-2.0-SSHD" is (the authors don't know either).
* This is a Nadia Heninger paper, and Heninger is, like, the modern master of the Coppersmith RSA attack, which transforms an RSA problem into a series of polynomials and then transforms those into a linear algebra problem set in a lattice (roughly: a vector space with exclusively integer components; really, when we say "lattice" we mean "some generated basis for that lattice"). You then use the LLL algorithm to reduce the basis, which gives you small vectors that, when reframed back into polynomials or whatever, can tractably solved for their roots. Get the intuition? Yeah, I mean, me neither. Lattice attacks on PQ crypto have a simpler intuition! But the lattices bases here are just R^3 matrices, so, that's pretty simple.
* You can get the intuition for the underlying vulnerability much more simply. From the paper: Boneh, DeMillo, and Lipton noted that if an attacker had a correct signature s and an incorrect signature s_hat of this form then the attacker could compute gcd(N, s_hat − s) = p. The complicated math comes from the fact that while we have the incorrect signature we're hoping for, we don't have the correct signature over the same message, or a fully known message.
* This attack is made possible by our old friend PKCSv1.5, this time in a signing setting. It works because a P1v1.5 RSA signature has regular format: 00 01 FF ... FF 00 aa .. aa hh .. hh, where aa are the (known) bits of the ASN.1 identifier of the hash, and hh are the (unknown) bits of the hash. Everything but the bit values of hh is known to the attacker.
* Amusing detail: the attack relies on a condition of the unknown bits being less than 1/4 of the RSA message (modulus) size, so the attack actually gets harder for RSA-1024 with better hashes, and is impossible for RSA-1024 with SHA2-512, which blows that budget.
* Another thing that uses PKCSv1.5-RSA signatures is DNSSEC. You could scan the Internet collecting DNSSEC signatures hoping to find some that don't validate (I think it's RIPE that periodically does surveys looking for invalid DNSSEC records, and routinely finding them?), and because all RSA DNSSEC is in viable parameters for this attack I guess recover keys from it? Or you could just not use DNSSEC. I guess maybe this is particularly problematic for "online-signers"; most DNSSEC signatures are computed offline, so you can't just repeatedly ask for new signatures waiting for a fault, but you could with an online signer.
But imagine if the problem is slightly changed to f(x) = (x + padding)^3 - a mod N, for some large but known padding. Now you can't simply compute an integer cube root, and the above doesn't work anymore. But you can look for a product of f(x) by some other polynomial g(x): f(x)g(x) necessarily contains the roots of f(x), but may have smaller coefficients such that evaluating f(x)g(x) at the root you're looking for does not wrap around N.
Now look into what this multiplication looks like: f(x)g(x) = c_0 f(x) + c_1 x f(x) + c_2 x^2 f(x) + ..., where c_i are the unknown coefficients of g(x). This is a sum of integer multiples of known coefficient vectors and, to find the set of coefficients c_i such that this sum is a small as possible is exactly the same as finding a short vector in the lattice (f(x), xf(x), x^2f(x), ...), where the vectors are the coefficients of each (shifted) polynomial. You also have to weight the coefficients according to their exponent, to account for the fact that the coefficient of, say, x^3 needs to be smaller than x to avoid wrap-around.
Once lattice reduction gives you back an f(x)g(x) with small enough coefficients you compute its integer roots, which is doable very quickly, and check which of them work for f(x). That's the gist of it. The full-fledged method also includes powers of f(x) modulo powers of N instead of simply multiplying it by g(x), and that was the trick Coppersmith introduced, but if you understand the simple version the full version is straightforward.
I think this is from the j2ssh/maverick SSH server, used in a bunch of enterprisey Java products.
https://jadaptive.com/en/products/java-ssh-server
https://github.com/sshtools/j2ssh-maverick/blob/ce11ceaf0aa0...
These are hardware features where a private key is hardcoded in the chip and never supposed to be revealed. You can ask the chip to sign things for you. It has some anti-tampering measures, but it might be possible to induce faults without too much effort, if you apply heat, EM ("cosmic rays"), and play with voltage/frequency a little
https://www.usenix.org/conference/usenixsecurity17/technical...
The researchers made an app that can run as a normal user and extract the hardware enclave’s private key.
DVFS aside, there's plenty of ways to stress and CPU and cause random errors. I don't know whether their RSA implementation protects against this attack, though.
While at first it may seem an unlikely attack, it's probably more real than you'd think, given the number of times any single server does TLS negotiation using a given private key. The attack becomes even more likely when you realize that multiple servers will be using the private key.
In practice, this gives middle boxes more power, and raises their profile in the threat model significantly. This also opens up the possibility of simply collecting failed transient failed tls negotation data from a large number of (legitimate) clients to reconstruct a private key.
now put your tinfoil hat on and suppose you worked for a paramilitary organization that had infiltrated the top 2 semiconductor manufacturers. You persuade the silicon designers, when implementing hardware accelerated crypto (or "management engines") to not do their jobs quite perfectly, no just leave room for a tiny bit of....error. Could never happen, right?
Edit: Don't ask me questions, i don't know shit, i just rephrased stuff from the linked paper.
i wasn't wrong