OpenAI starts rolling out new voice mode

(twitter.com)

51 points · doubtfuluser · 1y ago · 24 comments

23 comments · 7 top-level

mft_ · 1y ago · 5 in thread

Related question that I was briefly pondering last night: is iOS only offering Siri anticompetitive?

Should (will?) Apple have to open up and allow other agents direct access in the same way Siri has? Either allowing user control over the backend that Siri uses for everything, or alternatively allowing other agents system access so that “hey Alexa” or “hey ChatGPT” are monitored and actioned the same as “hey Siri”?

harperlee · 1y ago

Not a specialist here, but my understanding is that whether a service is anticompetitive depends on how its market is structured. So e.g. if you're not a dominant market player then it "is" not anticompetitive ("is" in a pragmatic way, i.e. you will not be prosecuted), whereas if you are disrupting the market it is.

All that to say that currently the market for voice assistants is nascent. No one has paid (directly) for Siri or "OK Google", and Amazon can also make the case that you don't pay for Alexa, you pay for the device.

Now that OpenAI has a better service that is also unbundled from hardware, and people understandably might want to substitute it for Siri, iOS only offering Siri can start to be considered anticompetitive.

dagmx · 1y ago

You’re correct in that a market has to first be defined.

I also wonder if the way that Apple is integrating OpenAI into Siri now is a gambit against such a thing.

If Siri acts as a frontend to other competing products, then it likely keeps the wolves at bay so to speak.

captn3m0 · 1y ago

The relevant market might be “digital voice assistants on iOS”, similar to how recent rulings against the AppStore monopoly have used “AppStores on iOS” as the market.

pickledoyster · 1y ago

The things a user can do with a locked iOS device are so useless, and keep disappearing with new updates, that I've given up. I don't think a different voice module changes anything here if it gets the same $swearword 'direct access' Siri has now.

For example:

1. Siri can run Shazam/Music Recognition on a locked phone, but has no Shortcut/Automation ability to copy over the song info to a native note, which is as simple as: 'Run Recognize Music; delay; Create note with $Title - $Artist in $Notebook; dismiss Siri; Stop.' If I try telling Siri to run this Shortcut, it asks me to unlock the phone. The Shortcut has 'Allow Running When Locked' set to ON. I guess Siri is limited in one of two ways: it is either unable to access Shortcuts or it can't run Music Recognition. The latter would be bonkers, since it can do that natively by me saying 'Shazam', and the former simply disrespects my settings and gives me no way of making it work.

There used to be a workaround for this: you'd first enable Voice Control (an accessibility feature that's different from Siri, I guess), tell Siri to turn Voice Control on, call it out and tell it to run the Shortcut. It runs, but asks for a passcode as soon as it has to open the Notes app. Useless.

2. I can't start a voice recording without unlocking my phone, while I am able to launch the Camera app and start a video recording, which makes no sense to me.

3. I can't turn Location Services off without unlocking, which might be one of those instances where Apple thinks it knows better (i.e. the phone's stolen or I get kidnapped and the perpetrators can't turn location off as easily), but my default is having it off and using location only when I need it (e.g., navigating in an unfamiliar location, being lost, and... that's about it).

herbst · 1y ago

They will just invent some arguments why none of this is possible due to their _amazing_ security model.

crimsoneer · 1y ago · 4 in thread

The fact we haven't seen many recordings of this "in the wild" suggests the user group so far must be really small...

sumedh · 1y ago

This is the only video I could find.

https://www.youtube.com/watch?v=cEeyQmR28l0

chris-orgmenta · 1y ago

I hope that we can smack the personality down to zero.

ChatGPT already ignores instructions such as 'questions are NEVER rhetorical, answer every question I ask directly', 'NEVER apologise or say sorry' and 'NEVER pretend to be human, and NEVER imply you have emotion or personality'.

I worry that OpenAI will make it worse by leaning into this Her stuff.

I just don't want to talk to a pseudo-human. I want to talk to the machine.

From that video: I would have wanted to complete that conversation within ten seconds; any more and it is wasting my time with its personality.

Human: "French - Pronounce croissant"

Bot: "Croissant. Notice emphasis on nasal `ant` syllable. Croissant."

H: "Pronunciation of baguette"

Bot: "Baguette. Notice emphasis on second syllable. Baguette."

Any less density than that, and I feel like OpenAI doesn't respect me or my time.

scoot · 1y ago

This, from the pre-roll ad on YouTube before your video, impressed me more for the quality of the voice: https://artlist.io/voice-over (but maybe I was put off by ChatGPT speaking French with an American accent).

rayxi271828 · 1y ago

I'm not French, but is it just me, or is that "bien sûr" pronunciation atrocious? Also that "parfait" at the end... urgh.

cranberryturkey · 1y ago · 4 in thread

This is a Siri killer.

pseudocomposer · 1y ago

Can it integrate with mail, messages, calendar, reminders, notes, journal, etc. the same way Siri can? Otherwise it seems like a competitor to Alexa or Google Assistant at best.

mensetmanusman · 1y ago

Read as serial killer

wanderingstan · 1y ago

But didn’t Apple and OpenAI ink a deal recently?

jamesmalin · 1y ago

https://openai.com/index/openai-and-apple-announce-partnersh...

roshankhan28 · 1y ago · 2 in thread

I guess I was in the beta test for it, as I tried it on my phone two months back. It's pretty good. It's only a Siri killer if Apple gives it access to the whole iOS system. I use Android, and Google keeps an eye on each of my activities.

sumedh · 1y ago

You had the standard voice mode, which is free; this feature is the advanced voice mode.

flemhans · 1y ago

Be aware that there's another voice mode (the "headphones") that has been around for a while.

zendist · 1y ago · 1 in thread

Same as https://www.theverge.com/2024/7/30/24209650/openai-chatgpt-a... ? I don't have Twitter, so I'm not 100% sure.

romseb · 1y ago

Full tweet:

"We’re starting to roll out advanced Voice Mode to a small group of ChatGPT Plus users. Advanced Voice Mode offers more natural, real-time conversations, allows you to interrupt anytime, and senses and responds to your emotions.

Users in this alpha will receive an email with instructions and a message in their mobile app. We'll continue to add more people on a rolling basis and plan for everyone on Plus to have access in the fall. As previously mentioned, video and screen sharing capabilities will launch at a later date.

Since we first demoed advanced Voice Mode, we’ve been working to reinforce the safety and quality of voice conversations as we prepare to bring this frontier technology to millions of people.

We tested GPT-4o's voice capabilities with 100+ external red teamers across 45 languages. To protect people's privacy, we've trained the model to only speak in the four preset voices, and we built systems to block outputs that differ from those voices. We've also implemented guardrails to block requests for violent or copyrighted content.

Learnings from this alpha will help us make the Advanced Voice experience safer and more enjoyable for everyone. We plan to share a detailed report on GPT-4o’s capabilities, limitations, and safety evaluations in early August."

tomjen3 · 1y ago

Should have been rolling out months ago, but better late than never I guess.

TowerTall · 1y ago

Is there a place to hear what the voice(s) sound like?
