They already are. I’m successfully using frameworks like bmad to deliver complex apps at that level. My job is to manager the see, as, ux, sre processes and catch errors.
I spend more time refinding prd , epics and stories than I do elbows deep in code.
If I don’t like the output of a story I nuke it change the story and have the flanker try again. I’m using the open source glm, kimi, deepseek models. I expect the full pipeline to be good enough by the end of the year.