undefined | Better HN

0 pointsbearjaws2y ago0 comments

OAI just made an embarrassment of Google's fake demo earlier this year. Given how this was recorded, I am pretty certain it's authentic.

0 comments

12 comments · 4 top-level

hehdhdjehehegwv2y ago· 5 in thread

This feature has been in iOS for a while now, just really slow and without some of the new vision aspects. This seems like a version 2 for me.

bigyikes2y ago

That old feature uses Whisper to transcribe your voice to text, and then feeds the text into the GPT which generates a text response, and then some other model synthesizes audio from that text.

This new feature feeds your voice directly into the GPT and audio out of it. It’s amazing because now ChatGPT can truly communicate with you via audio instead of talking through transcripts.

New models should be able to understand and use tone, volume, and subtle cues when communicating.

I suppose to an end user it is just “version 2” but progress will become more apparent as the natural conversation abilities evolve.

mucle62y ago

Does it feed your audio directly to gpt4?To test it I said in a very angry tone "WHAT EMOTION DOSE IT SOUND LIKE I FEEL RIGHT NOW?" and it said it didn't know because we are communicating over text

hehdhdjehehegwv2y ago

Yes, per my other comment this is an improvement on what their app already does. The magnitude of that improvement remains to be seen, but it isn’t a “new” product launch like a search engine would be.

abhpro2y ago

No it's not the same thing, the link for this submission even explains that. Anyone who comments should at least give the submission a cursory read.

hehdhdjehehegwv2y ago

I did and regardless of the underlying technology it is, in fact, an improvement to an existing product - not something new from whole cloth.

If they had released a search engine, which had been suggested, that would be a new product.

readams2y ago· 2 in thread

https://twitter.com/Google/status/1790055114272612771

ianbicking2y ago

This demo feels a lot like GPT-V. Like they've gotten a lot of the latencies down, but it's doing the same thing GPT was doing previously with transcription after silence detection and TTS of the output.

sumedh2y ago

Is there a reason why Open AI and Google have events so close to each other?

CivBase2y ago· 1 in thread

I don't doubt this is authentic, but if they really wanted to fake those demos, it would be pretty easy to do using pre-recorded lines and staged interactions.

mike006322y ago

For what it's worth, OpenAI also shared videos of failed demos:

https://vimeo.com/945591584

I really value how open they are being about its limitations.

nojvek2y ago

Let OAI actually be released to the masses. Then we can compare.

I’m not a big fan of announcing something but it not being released.

They say available for api but it’s text only. Can’t send audio stream to get audio stream back.

Time will tell. I’m holding my emotions after I get my hands on it.

j / k navigate · click thread line to collapse

0 comments

12 comments · 4 top-level

hehdhdjehehegwv2y ago· 5 in thread

This feature has been in iOS for a while now, just really slow and without some of the new vision aspects. This seems like a version 2 for me.

bigyikes2y ago

That old feature uses Whisper to transcribe your voice to text, and then feeds the text into the GPT which generates a text response, and then some other model synthesizes audio from that text.

This new feature feeds your voice directly into the GPT and audio out of it. It’s amazing because now ChatGPT can truly communicate with you via audio instead of talking through transcripts.

New models should be able to understand and use tone, volume, and subtle cues when communicating.

I suppose to an end user it is just “version 2” but progress will become more apparent as the natural conversation abilities evolve.

mucle62y ago

Does it feed your audio directly to gpt4?To test it I said in a very angry tone "WHAT EMOTION DOSE IT SOUND LIKE I FEEL RIGHT NOW?" and it said it didn't know because we are communicating over text

hehdhdjehehegwv2y ago

abhpro2y ago

No it's not the same thing, the link for this submission even explains that. Anyone who comments should at least give the submission a cursory read.

hehdhdjehehegwv2y ago

I did and regardless of the underlying technology it is, in fact, an improvement to an existing product - not something new from whole cloth.

If they had released a search engine, which had been suggested, that would be a new product.

readams2y ago· 2 in thread

https://twitter.com/Google/status/1790055114272612771

ianbicking2y ago

sumedh2y ago

Is there a reason why Open AI and Google have events so close to each other?

CivBase2y ago· 1 in thread

I don't doubt this is authentic, but if they really wanted to fake those demos, it would be pretty easy to do using pre-recorded lines and staged interactions.

mike006322y ago

For what it's worth, OpenAI also shared videos of failed demos:

https://vimeo.com/945591584

I really value how open they are being about its limitations.

nojvek2y ago

Let OAI actually be released to the masses. Then we can compare.

I’m not a big fan of announcing something but it not being released.

They say available for api but it’s text only. Can’t send audio stream to get audio stream back.

Time will tell. I’m holding my emotions after I get my hands on it.

j / k navigate · click thread line to collapse