teruakohatu 5mo ago (0 points)

> 1. The nerf is psychological, not actual. 2. The nerf is real but in a way that is perceptible to humans, but not benchmarks.

They could publish weekly benchmarks to disprove this. They almost certainly have internal benchmarking.

The shift is certainly real. It might not be model performance but contextual changes or token performance (tasks take longer even if the model stays the same).

ChadNauseam 5mo ago

Anyone can publish weekly benchmarks. If you think Anthropic is lying about not nerfing their models, you shouldn't trust benchmarks they release anyway.

teruakohatu (OP) 5mo ago

I never said they were lying. They haven’t stated that they do not tweak compute, and we know the app is updated regularly.
