
0 points · raincole · 6mo ago · 0 comments

What example do you need? In every single benchmark AI is getting better and better.

Before someone says "but benchmarks don't reflect the real world..." please name what metric you think is meaningful if not benchmarks. Token consumption? OpenAI/Anthropic revenue?


5 comments · 5 top-level

jacobsenscott · 6mo ago

Whenever I try to use a "state of the art" LLM to generate code, it takes longer to get a worse result than if I had just written the code myself from the start. That's the experience of every good dev I know. So that's my benchmark. AI benchmarks are BS marketing gimmicks designed to give the appearance of progress - there are tremendous perverse financial incentives.

This will never change because you can only use an LLM to generate code (or any other type of output) you already know how to produce and are expert at - because you can never trust the output.

fzeroracer · 6mo ago

AI is getting better at every benchmark. Please ignore that we're not allowed to see these benchmarks and also ignore that the companies in question are creating the benchmarks that are being exceeded.

azemetre · 6mo ago

What metrics that aren't controlled by industry show AI getting better? Genuinely curious, because those "ranking sites" seem to me to be infested with venture capital, so hardly fair or unbiased. The only reports I hear from academia are overly negative on AI.

bluefirebrand · 6mo ago

> please name what metric you think is meaningful

Job satisfaction and human flourishing

By those metrics, AI is getting worse and worse

philipwhiuk · 6mo ago

OpenAI net profit.

The figures for cost are wildly off to start with.
