Yes, and it's working. People ARE currently building things.
They are NOT sitting around whining that the LLM isn't smart enough and that they must wait for the next model before it can solve any problem (again, RELIABLY) on demand.
> People are trying to get these things to do complex multi-step reasoning tasks, including making changes to their codebase (?!), automating behaviors as "agents" that need to predictably and secure function...
You understand that all of these are powered by tool calling underneath? The planning and orchestration of tasks follows a structure that gets fed into tools. This is why a plain model can't do much on its own: people have to build products with the right tools on top to make an LLM behave the way they want, instead of just chit-chatting endlessly.
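To make that concrete, here is a minimal sketch of the orchestration loop I mean. Everything here is hypothetical (the `fake_model` stub, the `get_weather` tool, the message shapes); real products wire a hosted model into the same loop, but the structure is the point: the model emits structured tool calls, the *client* executes them, and results are fed back.

```python
import json

# Hypothetical tool registry -- the "right tools on top" of the model.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"  # stub; a real tool would call a weather API

TOOLS = {"get_weather": get_weather}

def fake_model(messages):
    # Stand-in for the LLM: first emits a structured tool call,
    # then, once a tool result is in context, a final answer.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_call": {"name": "get_weather",
                              "arguments": json.dumps({"city": "Paris"})}}
    return {"content": "It's sunny in Paris."}

def run(messages):
    # The orchestration loop: parse model output into tool calls,
    # run the tools locally, append results, repeat until a plain answer.
    while True:
        reply = fake_model(messages)
        call = reply.get("tool_call")
        if call is None:
            return reply["content"]
        result = TOOLS[call["name"]](**json.loads(call["arguments"]))
        messages.append({"role": "tool", "name": call["name"],
                         "content": result})

print(run([{"role": "user", "content": "Weather in Paris?"}]))
```

Swap `fake_model` for an actual API call and `TOOLS` for real integrations, and this loop is the skeleton of every "agent" product.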
The arbitrary-execution approach, if it ever works, will work by building tools and MCP servers for code execution, because obviously it isn't the LLM server that executes the code.
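A code-execution tool is itself just another function the client runs. A minimal (and deliberately naive) sketch, assuming a local Python interpreter; any real deployment would add sandboxing on top (containers, resource limits), which this stub omits:

```python
import os
import subprocess
import sys
import tempfile

def execute_python(code: str, timeout: float = 5.0) -> str:
    # Runs on the client/orchestrator side, never on the LLM server.
    # Write the model-supplied code to a temp file and run it in a
    # subprocess, capturing whatever it prints.
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run([sys.executable, path],
                              capture_output=True, text=True,
                              timeout=timeout)
        return proc.stdout + proc.stderr
    finally:
        os.remove(path)

print(execute_python("print(2 + 2)"))  # the string fed back to the model
```

Expose that function as a tool (or behind an MCP server) and the model "executes code" only in the sense that it asks the client to.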
You clearly have never thought about how to actually build any of these things.
> ...writing API boilerplate
Tool/function calling can be anything; it's you who decided it can only be used for API boilerplate. Does the word "function" always mean boilerplate in programming?