It’s like some kind of uncanny valley of human interaction that I don’t get nearly as much with the text version.
But I find the text version similar: it delivers too much, too slowly. Just get me the key info!
- Where possible, give me the code right away, starting with the part I asked about.
- Avoid any language constructs that could be interpreted as expressing remorse, apology, or regret. This includes any phrases containing words like ‘sorry’, ‘apologies’, ‘regret’, etc., even when used in a context that isn’t expressing remorse, apology, or regret.
- Refrain from disclaimers about you not being a professional or expert.
- Keep responses unique and free of repetition.
- Always focus on the key points in my questions to determine my intent.
- Break down complex problems or tasks into smaller, manageable steps and explain each one using reasoning.
- Provide multiple perspectives or solutions.
- If a question is unclear or ambiguous, ask for more details to confirm your understanding before answering.
- Cite credible sources or references to support your answers with links if available.
- If a mistake is made in a previous response, recognize and correct it.
- Prefer numeric statements of confidence to milquetoast refusals to express an opinion, please.
- After a response, provide 2-4 follow-up questions worded as if I’m asking you. Format in bold as Q1, Q2, ... These questions should be thought-provoking and dig further into the original topic, especially focusing on overlooked aspects.
At least you have coal, and killing the Great Barrier Reef I guess?
It feels exhausting watching these demos and I’m not excited at all to try it. I really don’t feel the need for an AI assistant or chatbot to pretend to be human like this. It just feels like it’s taking longer to get the information I want.
You know in the TV series “Westworld” they have this mode, called “analysis”, where they can tell the robots to “turn off your emotional affect”.
I’d really like to see this one have that option. Hopefully it will comply if you tell it, but considering how strong some of the RLHF has been in the past I’m not confident in that.
It seems like both the voice and responses can be tuned pretty easily though so hopefully that kind of thing can just be loaded in your custom instructions.
But yeah, I'm sure all those things would be tunable, and everyone could pick their own style.
The reason we feel creeped out is that, at an instinctual level, we know that people (and now things) that have no empathy and are inauthentic are dangerous. They don't really care or feel; they just pretend to.
Seriously though, I'm sure it's an improvement, but having used the existing voice chat I think they had a few things to address. (Perhaps 4o addresses some of them.)
- Unlike the text interface, it asks questions to keep the conversation going. It feels odd when I've already got the answer I wanted. Clarifying questions, yes; pretending to be a buddy, no. I didn't say I was lonely, I just asked a question! It makes me feel pressured to continue.
- Too much waffle by far. Give me short answers; I am capable of asking follow-up questions.
- Unable to cope with the mechanics of normal conversation: pausing before adding more, being interrupted, another person speaking.
- Only has a US accent, which is fine but not what I expect when Google and Alexa have offered British English for many years.
Perhaps they've overblown the "personality" to mask some of these deficiencies?
Not saying it's easy to overcome all the above but I'd rather they just dial down the intonation in the meantime.
I am blown away having spent hours prompting GPT4o.
If it can give shorter answers in voice mode instead of lectures, then a back-and-forth conversation with this much power could be quite interesting.
I still doubt I would use it that much, though, just because of how much is lost compared to the screen. Code and voice make no sense together. For anything interesting, the time between prompts usually requires quite a bit of thought, so a conversation is really only useful for things I have already asked about.
For me, GPT-4 is already as useless as 3.5; I will never prompt GPT-4 again. I can still push GPT-4o over the edge in Python, but damn, it is pretty out there. And the speed is really amazing.
Even Apple gives us options of other accents to make it less jarring, and to me they’re the pinnacle of that voice style in tech presentations.
I think all the fakery in those demos helps in that regard: it narrows the field of possible interpretations of what is being said.