I hope the next version delivers on being smarter. Instead of making me excited, this update makes me feel they've reached a plateau on improving the core value and are distracting us with fluff instead.
I predict there will be a zoo (more precisely a tree, as in "family tree") of models and derived models for particular application purposes, alongside continued development of enhanced "universal"/foundational models. Some will focus on minimizing memory, others on minimizing pre-training or fine-tuning energy consumption; some need high accuracy, others hard real-time speed, yet others multimodality like GPT-4o, some multilinguality, and so on.
Previous language models that encoded dictionaries for spellcheckers, etc., never got standardized (for instance, compare aspell dictionaries to the ones from LibreOffice, or to the language model inside CMU PocketSphinx), so you couldn't use them across applications or operating systems. As these models become more common, it would be interesting to see that aspect improve this time around.
https://www.rev.com/blog/resources/the-5-best-open-source-sp...
I think people who emphasize specialized models are operating under a false assumption: that by narrowing a model's focus, it will be able to go deeper in that domain. The opposite seems to be true.
Granted, specialized models like AlphaFold are superior in their domain but I think that'll be less true as models become more capable at general learning.
For the average Joe programmer like me, GPT4 is already "dirt cheap". My typical monthly bill is $0-3 using it as much as I like.
The one time it was high was when I had it take 90+ hours of Youtube video transcripts, and had it summarize each video according to the format I wanted. It produced about 250 pages of output.
That month I paid $12-13. Well worth it, given the quality of the output. And now it'll be less than $7.
For the average Joe, it's not expensive. Fast food is.
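The cost math for a job like that is easy to sketch. Everything below is an illustrative assumption — the per-million-token prices, the words-per-hour and words-per-page estimates, and the word-to-token ratio are all rough guesses, not official figures:

```python
# Back-of-envelope API cost estimate. All prices and token counts here
# are illustrative assumptions; check the provider's current pricing
# page and tokenizer before relying on them.

def api_cost(input_tokens, output_tokens,
             price_in_per_m=5.00, price_out_per_m=15.00):
    """Dollar cost of a job, given assumed per-million-token prices."""
    return (input_tokens * price_in_per_m +
            output_tokens * price_out_per_m) / 1_000_000

# Rough numbers for the transcript-summarizing job described above:
# ~90 h of speech at ~9,000 words/h is ~810k words, call it ~1.1M tokens in;
# ~250 pages out at ~500 words/page is roughly 170k tokens out.
cost = api_cost(1_100_000, 170_000)
print(f"${cost:.2f}")  # in the same ballpark as the bill described above
```

Halving the assumed prices (as a cheaper model generation tends to do) halves the result, which is roughly where the "less than $7" figure comes from.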
OpenAI seems to build in cycles. First they focus on capabilities, then they work on driving the price down (occasionally at some quality degradation)
That said, given the price tag, when AI becomes genuinely expert then I'm probably not going to have a job and neither will anyone else (modulo how much electrical power those humanoid robots need, as the global electricity supply is currently only 250 W/capita).
In the meantime, making it a properly real-time conversational partner… wow. Also, that's kinda what you need for real-time translation, because: «be this, that different languages the word order totally alter and important words at entirely different places in the sentence put», and real-time "translation" (even when done by a human) therefore requires having a good idea what the speaker was going to say before they get there, and being able to back-track when (as is inevitable) the anticipated topic was actually something completely different and so the "translation" wasn't.
A real-time translator would be a killer app indeed, and it seems not so far away, but note how you have to prompt the interaction with "Hey ChatGPT"; it does not interject on its own. It is also unclear whether it can tell when multiple people are speaking and who's who. I guess we'll see soon enough :)
Indeed; I would be pleasantly surprised if it can both notice and separate multiple speakers, but only a bit surprised.
GPT-4o got slightly better overall. Ability to reason improved more than the rest.