What they are more than capable of is producing really convincing hype claiming that LLMs can do anything and everything.
Language models are more or less the same thing, except they're trained on human language. Since human language is an encoding of human thought processes, the model learns the patterns of human thought that are embedded in text, and can both generate and recognize them.
This is why LLMs can "understand" and answer complicated questions that don't exist in their training set, and do things like debug code. Even though the model never saw that exact question, it recognizes the high-level, abstract concepts that make up the question (since those concepts exist in various forms in the training set).
So giving an LLM a textual description of an algorithm, and having it detect logical errors in it, is not that unbelievable. Whether current models are good enough to do it well is the question to ask.
The equivalent of an existing handwriting recognizer for e.g. RSA wouldn't be a model that tells you how to crack RSA, it'd be a model that _does_ crack it (maybe by returning a probability distribution over the plaintext that's better than uniform). That feels pretty unlikely to me personally, but maybe that's doable, who knows.
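To make "better than uniform" concrete, here is a minimal sketch of how you might score such a model on one sample. Everything in it is hypothetical: `advantage_bits` is a name made up for illustration, and the "model output" is just a hand-written dictionary of probabilities over a tiny plaintext space, not anything a real model produced.

```python
import math

def advantage_bits(predicted: dict[int, float], true_plaintext: int, space_size: int) -> float:
    """Bits gained over a uniform guess for a single sample.

    `predicted` maps candidate plaintexts to probabilities (hypothetical model output).
    A result of 0 means no better than uniform; positive means better.
    """
    p = predicted.get(true_plaintext, 0.0)
    if p == 0.0:
        # The model ruled out the correct answer entirely.
        return float("-inf")
    return math.log2(p) - math.log2(1.0 / space_size)

# Toy plaintext space of 4 values; the true plaintext is 2.
uniform_guess = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}
informed_guess = {0: 0.25, 1: 0.125, 2: 0.5, 3: 0.125}

uniform_score = advantage_bits(uniform_guess, 2, 4)
informed_score = advantage_bits(informed_guess, 2, 4)
```

A model that "cracks" the cipher in this weak sense would need a positive score on average over many ciphertexts, not just on one lucky sample.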
For the standard of "the LLM tells us how to crack RSA," it feels like the equivalent for handwriting recognition would be the LLM outputting a description of a novel algorithm for handwriting recognition (and probably one that isn't just an existing one with some hyperparameters tweaked).
My understanding of the weaknesses of current LLMs is that they're actually pretty bad at that, since they tend to regurgitate existing content whenever they can.
Well, my point was based on the assumption that there is a flaw/backdoor in RSA, but nobody has discovered it yet. So you'd just need a model that can take a textual description of the algorithm as input, and spit out a list of logical errors/flaws.
I'm not saying there is a flaw or backdoor, but if there were, then an LLM would potentially be able to find it, while a team of human experts could miss it.
A model that receives an RSA encrypted payload and outputs the decrypted version would of course be impossible unless the above assumption is true... or you give it access to some compute power so it can either try to brute-force it, or try to track down and hack the servers with the keys :P
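For a sense of what the brute-force route looks like, here is a rough sketch on a toy key. The parameters (p = 61, q = 53, e = 17) and the function names are picked purely for illustration, and there is no padding. Trial division only works because the modulus is tiny; it is hopeless against a real 2048-bit modulus, which is the whole point.

```python
from math import isqrt

def factor_semiprime(n: int) -> tuple[int, int]:
    """Find a nontrivial factorization of n by trial division (toy sizes only)."""
    if n % 2 == 0:
        return 2, n // 2
    for p in range(3, isqrt(n) + 1, 2):
        if n % p == 0:
            return p, n // p
    raise ValueError("no nontrivial factor found")

def brute_force_decrypt(ciphertext: int, e: int, n: int) -> int:
    """Recover the plaintext by factoring n and rebuilding the private exponent."""
    p, q = factor_semiprime(n)
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse; needs Python 3.8+
    return pow(ciphertext, d, n)

# Toy textbook-RSA parameters, chosen for illustration only.
p, q, e = 61, 53, 17
n = p * q
message = 65
ciphertext = pow(message, e, n)

recovered = brute_force_decrypt(ciphertext, e, n)
```

The loop runs on the order of sqrt(n) iterations, so every extra bit in the modulus multiplies the work by about 1.4, and compute power alone runs out long before real key sizes.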