This feels different; I asked DeepSeek R1 to give me an autoregressive image generation code in pytorch and it did a marvelous job. Similar for making a pytorch model for a talking lip-synced face; those two would take me weeks to do, AI did it in a few minutes.
Autoregressive LLMs still have some major issues like over-dependency on the first few generated tokens and the problems with commutative reasoning due to one-sided masked attention but those issues are slowly getting fixed.