
0 points · themgt · 6d ago · 0 comments

> That’s not what your quotes said. They said bigger models = plateau in intelligence, nothing about more data or increased hallucinations ... I’m pretty sure #1 is well known

Well known in a multiverse branch where Fable was a dud?


3 comments · 1 top-level

an0malous · 6d ago · 2 in thread

No, well known in the current multiverse branch where we still occasionally use things like math and scientific analysis instead of people’s vibe checks and pelican SVGs.

Here’s the paper from OpenAI where Dario himself was a co-author: https://arxiv.org/pdf/2001.08361

> We have observed consistent scalings of language model log-likelihood loss with non-embedding parameter count N, dataset size D, and optimized training computation Cmin, as encapsulated in Equations (1.5) and (1.6). Conversely, we find very weak dependence on many architectural and optimization hyperparameters. Since scalings with N,D,Cmin are power-laws, there are diminishing returns with increasing scale.

themgt (OP) · 6d ago

> instead of people’s vibe checks and pelican SVGs.

Right, what happened is everyone went to Fable and asked it to make the very best bicycle pelican SVG, no mistakes. And Fable's bicycle pelican SVGs were such timeless masterpieces, we all instantly got AI psychosis. Happily, you were immune to this.

coldtea · 5d ago
