And actually, perhaps PKI is not that good a fit for this case altogether. Instead we could extend the original idea with simple primitives like an infinite hash chain (https://ieeexplore.ieee.org/document/7509492). In this scheme, during every authentication round, a user reveals a pre-committed secret and simultaneously commits to a new one for the next interaction. This approach is already used on websites where authentication tokens are exchanged based on known hashes, and there are proven methods to keep these tokens continuously updated. It relies solely on hashes, just like your scheme, and can work by having both parties scan each other's QR codes on every interaction, which both performs the authentication check and updates the application's state for the next round.
The beauty of this method compared to PKI is twofold. First, it rests on a weaker assumption. More importantly, even if an attacker intercepts the initial QR code, they cannot afford to miss a single message exchange, or they lose the ability to authenticate. Moreover, if an attacker ever impersonates a party by following the protocol, the genuine party's next authentication will fail, and that discrepancy exposes the impersonation.
And it should not be too hard to build, so I might give it a try.
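For what it's worth, here is a minimal sketch of the reveal-and-commit idea, just to show the moving parts. It is not the construction from the linked paper: the class names `Prover` and `Verifier` are made up for illustration, SHA-256 is an assumed choice of hash, and the new commitment is not cryptographically bound to the revealed secret, so an attacker who can modify a message in flight is out of scope here.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def h(data: bytes) -> bytes:
    """Hash used for commitments (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(data).digest()


class Prover:
    """Holds the current secret; hypothetical name, not from the paper."""

    def __init__(self) -> None:
        self._secret = secrets.token_bytes(32)

    def initial_commitment(self) -> bytes:
        # Shared once, e.g. via the first QR code.
        return h(self._secret)

    def next_round(self) -> Tuple[bytes, bytes]:
        # Reveal the secret committed to earlier and commit to a fresh one.
        revealed = self._secret
        self._secret = secrets.token_bytes(32)
        return revealed, h(self._secret)


class Verifier:
    """Stores only the latest commitment received from the prover."""

    def __init__(self, commitment: bytes) -> None:
        self._commitment = commitment

    def check(self, revealed: bytes, next_commitment: bytes) -> bool:
        if not hmac.compare_digest(h(revealed), self._commitment):
            return False  # state mismatch: stale, replayed, or forged reveal
        self._commitment = next_commitment  # roll the state forward
        return True
```

If someone copies the prover's state and completes one round with the verifier, the verifier's stored commitment moves on, so the genuine prover's next `next_round()` output fails `check()`. That failure is the discrepancy that gives the impersonation away.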
If all authentication secrets (QR codes, TOTP codes, even PKI) are exchanged over the communication channel but do not authenticate the channel's feed itself, the attacker can simply forward them between the two victims, maintaining a perfect "bridge" with no obvious sign of tampering. Once the authentication phase is complete, they can terminate the redundant call and continue the conversation with the target, having passed the authentication.
It seems to me that the only defense against this is to authenticate the messages (text or feed) themselves, and for that we go back to regular MACs, which are already in use today.
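To make the MAC point concrete, here is a rough sketch of tagging each message with HMAC-SHA256 under a key the two parties already share. The function names `seal` and `open_sealed`, the 8-byte counter, and the packet layout are all assumptions made for illustration; how the key gets shared in the first place is a separate problem.

```python
import hashlib
import hmac

TAG_LEN = 32      # HMAC-SHA256 output size in bytes
COUNTER_LEN = 8   # per-message counter, used to reject replays


def seal(key: bytes, counter: int, message: bytes) -> bytes:
    """Return counter || message || tag, with the tag covering both."""
    header = counter.to_bytes(COUNTER_LEN, "big")
    tag = hmac.new(key, header + message, hashlib.sha256).digest()
    return header + message + tag


def open_sealed(key: bytes, expected_counter: int, packet: bytes) -> bytes:
    """Verify the tag and counter, then return the message."""
    if len(packet) < COUNTER_LEN + TAG_LEN:
        raise ValueError("packet too short")
    header = packet[:COUNTER_LEN]
    message = packet[COUNTER_LEN:-TAG_LEN]
    tag = packet[-TAG_LEN:]
    expected = hmac.new(key, header + message, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("bad tag: message was altered or key is wrong")
    if int.from_bytes(header, "big") != expected_counter:
        raise ValueError("unexpected counter: replayed or reordered message")
    return message
```

Because the tag covers the content itself, a relay that only forwards the handshake gains nothing: anything it injects afterwards fails `open_sealed` unless it also holds the key.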
In an "ideal" world:
- everybody would start using public/private key cryptography to authenticate each other, but that's still rather unwieldy nowadays. I'm not aware of any solution with a good UX;
- people would stop posting their photos/videos/audio recordings on the web, and also scrub anything that has been uploaded in the past.
We don't live in an "ideal" world, but TOTP is pretty widespread now, and you can easily read a TOTP code out over the phone, etc. So this solution was born.
"Malice" could ask Bob for his code, and lie about it matching (or maybe Malice has no code at all and is pretending to match), lulling Bob into thinking that authentication was successful based on taking Malice's word for it.
Seems like you would need two codes for mutual authentication. One for Alice to Bob, and one for Bob to Alice.
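A sketch of what the two-code variant could look like, assuming standard RFC 6238 TOTP (HMAC-SHA1, 30-second steps, 6 digits) and two independent shared secrets, one per direction. The names `a2b_secret` and `b2a_secret` and the helper functions are hypothetical and not taken from the app being discussed.

```python
import hashlib
import hmac
import secrets
import struct
import time


def totp(secret: bytes, at: float, digits: int = 6, step: int = 30) -> str:
    """RFC 6238 TOTP with HMAC-SHA1 and RFC 4226 dynamic truncation."""
    counter = int(at) // step
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def verify(secret: bytes, heard: str, at: float, step: int = 30, window: int = 1) -> bool:
    """Accept codes from adjacent time steps to tolerate small clock skew."""
    return any(
        hmac.compare_digest(totp(secret, at + d * step, step=step).encode(), heard.encode())
        for d in range(-window, window + 1)
    )


if __name__ == "__main__":
    # Two independent secrets, one per direction (hypothetical names).
    a2b_secret = secrets.token_bytes(20)
    b2a_secret = secrets.token_bytes(20)
    now = time.time()

    # Alice reads out her code; Bob checks it himself instead of trusting a "yes, it matches".
    print("Bob accepts Alice:", verify(a2b_secret, totp(a2b_secret, now), now))
    # Bob reads out his own, different code; Alice checks it on her side.
    print("Alice accepts Bob:", verify(b2a_secret, totp(b2a_secret, now), now))
```

The point is that each side has to produce a code the other side verifies locally, so Malice cannot get away with merely claiming that Bob's code matched.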
This was on Firefox Nightly on Windows 11.
In any case, I have just added the display of the base32 secret key.