So the claim (1) is that LLMs like Claude 3.5 are categorically incapable of reasoning and will never get better?
Yet their outputs are heuristically good enough to displace engineering jobs by driving the cost of code generation to nearly zero.
Even granting your position that these systems cannot reason in any capacity, it would be regrettable to bet that they won't improve, or that another breakthrough won't change the picture for a new generation of LLMs.