0 points · dontlikeyoueith · 15d ago · 0 comments

I disagree with your assessment pretty strongly -- the models themselves hit a wall over a year ago once companies exhausted all existing training data. LLMs don't induce world models, and they aren't capable of real search and planning outside their training distributions. They, structurally, never will be.

I haven't noticed a change in what I trust a model to generate in response to a single prompt in a year. The failure modes are unchanged. Yes, specific failures have improved as they have been documented and passed into model training data, but the way the models fail has not changed. They still fail for me nearly every single day. I'm a pretty heavy user - 3-4 Claude Code processes running at a time, all day every day.

What has gotten better is tooling around the model -- but there's no space for exponential growth there. At least, not without exponential cost increase, which would make the whole thing untenable anyway.

1 comments · 1 top-level

thedevilslawyer · 15d ago

If you think they've been at the same level for the past year, your skill is at issue. There was a huge jump in Nov/Dec with Opus 4.5+ and GPT 5.x series, and they've been incrementally stronger over the past 6 months.

As a next step, take another look at the best practices, and apply them to your work (simon's agentic series is a good place to start). Or not, you do you.
