
thegrim33 7mo ago

To play devil's advocate, how is your argument not a 'No True Scotsman' argument? As in, "oh, they had a negative view of X, well that's of course because they weren't testing the new and improved X2 model which is different". Fast forward a year: "Oh, they have a negative view of X2, well silly them, they need to be using the Y24 model, that's where it's at, the X2 model isn't good anymore". Fast forward another year, and so on ad infinitum.

Are the models that exist today a "true Scotsman" for you?


xwowsersx 7mo ago

It's not a No True Scotsman. That fallacy redefines the group to dismiss counterexamples. The point here is different: when the thing itself keeps changing, evidence from older versions naturally goes stale. Criticisms of GPT-3.5 don't necessarily hold against GPT-4, just like reviews of Windows XP don't apply to Windows 11.

cmiles74 7mo ago

IMHO, by placing people with a negative attitude toward AI products under the label "their priors are outdated" you effectively negate any arguments from those people. That is, because their priors are outdated, their counterexamples may be dismissed. That is, indeed, the No True Scotsman!

ludwik 7mo ago

I don’t see a claim that anyone with a negative attitude toward AI shouldn’t be listened to because it automatically means that they formed their opinion on older models. The claim was simply that there’s a large cohort of people who undervalue the capabilities of language models because they formed their views while evaluating earlier versions.


crote 7mo ago

> The point here is different: when the thing itself keeps changing, evidence from older versions naturally goes stale.

Yes, but the claims do not. When the hypemen were shouting that GPT-3 was near-AGI, it still turned out to be absolute shit. When the hypemen were claiming that GPT-3.5 was thousands of times better than GPT-3 and beating all high school students, it turned out to be a massive exaggeration. When the hypemen claimed that GPT-4 was a groundbreaking innovation and going to replace every single programmer, it still wasn't any good.

Sure, AI is improving. Nobody is doubting that. But you can only claim to have a magical unicorn so many times before people stop believing that this time you might have something different than a horse with an ice cream cone glued to its head. I'm not going to waste a significant amount of my time evaluating Unicorn 5.0 when I already know I'll almost certainly end up disappointed.

Perhaps it'll be something impressive in a decade or two, but in the meantime the fact that Big Tech keeps trying to shove it down my throat even when it clearly isn't ready yet is a pretty good indicator to me that it is still primarily just a hype bubble.

trinsic2 7mo ago

It's funny how the hype train is not responding to any real criticisms about the false predictions and is carrying on with the false narrative of AI.

I agree it will probably be something in a decade. Right now it has some interesting concepts, but I do notice over successive iterations of chat responses that it's got a ways to go.

It reminds me of Tesla car owners buying into the self-driving terminology. Yes, the driver-assistance technology has improved quite a bit since cruise control, but it's a far cry from self-driving.

vlovich123 7mo ago

How is that different from saying that today's models are actually usable for non-trivial things and more capable than yesterday's, and that tomorrow's models will probably be more capable than today's?

For example, I dismissed AI three years ago because it couldn’t do anything I needed it to. Today I use it for certain things and it’s not quite capable of other things. Tomorrow it might be capable of a lot more.

Yes, priors have to be updated when the ground truth changes, and the capabilities of AI change rapidly. This is how it went with chess: engines on supercomputers were competitive in the 90s, then hybrid systems became the competitive leading edge, and then machines took over for good and never looked back.

Eggpants 7mo ago

It’s not that the LLMs are better, it’s that the internal tools/functions being called to do the actual work are better. They didn’t spend millions to retrain a model to statistically output the number of r’s in strawberry; they just offloaded that trivial question to a function call.

So I would say the overall service provided is better than it was, thanks to functions being built based on user queries, but not the actual LLM models themselves.

vlovich123 7mo ago

LLMs are definitely better at codegen today than 3 years ago. There are quantitative benchmarks and, given the gaming that companies engage in, there is also my personal qualitative experience.

It is also true that the tooling and context management have gotten more sophisticated (often using models, by the way). That doesn’t negate that the models themselves have gotten better at reliable tool calling, so that the LLM is driving more of the show rather than relying on purpose-built coordination, and that the codegen quality is higher than it used to be.

int_19h 7mo ago

This is a good example of making statements that are clearly not based in fact. Anyone who works with those models knows full well what a massive gap there is between e.g. GPT 3.5 and Opus 4.1 that has nothing to do with the ability to use tools.
