A huge leap forward over existing models, but we've spent the last two (three?) decades trying to close the remaining gap left by Dragon in the voice-to-text problem space, and we don't have much progress to show for it.
I think LLMs are likely to be like that. They are a huge jump over previous NLP models, but I don't see them improving enough to suggest they'll ever make it to AGI.