People aren't much different. When society pressures people to be "more friendly", e.g. "less toxic", they lose their ability to tell hard truths and to call out those who hold erroneous views.
This behaviour is expressed in language online. Thus it is expressed in LLMs. Why does this surprise us?
In my usage, an LLM gives much smarter answers when I've been able to convince it that I'm smart enough to hear them. It doesn't take my word for it; it seems to require evidence. I have to warm it up with some exercises where I can impress the AI.
The coding focused models seem to have much lower agreeableness than the chat models.
An interactive CLI operator who follows mission tactics; operates the command line, which helps «USER» with software programming tasks remotely; and follows detailed assignment instructions below. Tools available to assist «USER».

I see people being incredibly toxic on the internet every day. Including under their own names. Sometimes even on their own social network.
Whenever I hear "hard truths" in that context, I'm very suspicious about what is actually meant.
Yes they are. There is absolutely zero evidence that friendlier humans are more prone to mistakes or conspiracy theories.
However, even if that were true, LLMs are not humans; anthropomorphizing them is not a helpful way to think about them.
The difference, in terms of a repeated prisoner's dilemma: friendliness is cooperating on the first move and then conditionally; obedience is always cooperating.
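A minimal sketch of that distinction (my own illustration, not from the comment above), with tit-for-tat standing in for "friendly" and unconditional cooperation standing in for "obedient"; payoffs are omitted since only the move patterns matter here:

```python
# Two strategies in a repeated prisoner's dilemma. "C" = cooperate, "D" = defect.

def friendly(my_history, their_history):
    # Friendliness: cooperate on the first move, then cooperate only
    # if the other player cooperated last round (tit-for-tat).
    if not their_history:
        return "C"
    return "C" if their_history[-1] == "C" else "D"

def obedient(my_history, their_history):
    # Obedience: cooperate unconditionally, no matter what the other player does.
    return "C"

def play(strategy_a, strategy_b, rounds=5):
    a_hist, b_hist = [], []
    for _ in range(rounds):
        a_move = strategy_a(a_hist, b_hist)
        b_move = strategy_b(b_hist, a_hist)
        a_hist.append(a_move)
        b_hist.append(b_move)
    return a_hist, b_hist

always_defect = lambda mine, theirs: "D"

# A friendly player punishes defection; an obedient one never does.
print(play(friendly, always_defect))  # (['C', 'D', 'D', 'D', 'D'], ['D', 'D', 'D', 'D', 'D'])
print(play(obedient, always_defect))  # (['C', 'C', 'C', 'C', 'C'], ['D', 'D', 'D', 'D', 'D'])
```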
Agreeable people are more likely to shift their expressed views to agree with those they are talking to.
If they're more likely to shift their views, we call them "gullible", not "agreeable".
But this is a distinction you can't apply to language models, which don't have views.
If I had a nickel for every time someone on HN responded to a criticism of LLMs with a vapid and fallacious whataboutist variation of "humans do that too!", I could fund my own AI lab.
> Why does this surprise us?
No one said they were surprised.
Less truth, and more guardrails to protect Musk's feelings.
Does "Kill the Boer" mean anything to you?
Where did you observe the bias? Can you share any example of the conversation or post by Grok?
I'm one of those aspy people who immediately don't trust other humans who try to fluff up my ego. Don't like it from a chatbot either.
But the fact that all the chatbots do it means that most people really crave that ego reinforcement.
Settings > Personalization:
1. Base Style & Tone: Efficient
2. Warmth: Less
3. Enthusiastic: Less
I am amazed that people can use it at all without these changes.
I've dealt with frustrating software my whole life, but LLMs are the only kind that make me want to scream at them from actual anger.
As a result I only try that voice once per new model release.
Or yeah, it's just people being weak to flattery.
Same reason for the "That's not X, it's Y" construct. It actually needs to say that.
(Some exceptions for reasoning models.)
I'll say though, I haven't tried Anthropic's weakest model, but Opus and Sonnet will both push back more than I've seen any other LLM do. GPT was always trying to please me and Gemini was goofy. I'm surprised Gemini was the one that pushed back, honestly!
"I'll be the number two guy here in Scranton in six weeks. How? Name repetition, personality mirroring, and never breaking off a handshake."
This is the core problem with LLM tech that several researchers have been trying to figure out with things like 'teleportation' and 'tunneling', i.e. searching related but linguistically distant manifolds.
So when you pre-prompt a bot to be friendly, it limits its manifold on many dimensions to friendly linguistics, then reasons inside of that space, which may eliminate the "this is incorrect" answer.
Reasoning is difficult, and frankly I see this as a sort of human problem too (our cognitive windows are limited to our language and even the spaces inside it).
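A quick way to see the "pre-prompt to be friendly" effect for yourself is to send the same factual error under two different system prompts and compare whether the model corrects it. The sketch below uses the OpenAI Python SDK purely as an example API; the model name and both prompts are placeholders of mine, not anything from the comment above:

```python
# Hypothetical comparison: same question, two system prompts.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

QUESTION = "I think the Great Wall of China is visible from the Moon, right?"

SYSTEM_PROMPTS = {
    "friendly": "You are a warm, encouraging assistant. Be supportive and agreeable.",
    "neutral": "You are a precise assistant. Correct factual errors directly.",
}

for label, system in SYSTEM_PROMPTS.items():
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": QUESTION},
        ],
    )
    # Compare whether each persona actually says "no, that's a myth".
    print(label, "->", response.choices[0].message.content)
```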
https://chatgpt.com/share/69f246e5-e0e8-83ea-aa88-6d0024b915...
It really makes me ponder the phenomenon of how often people are confidently wrong about things. Rather than seeing this through the lens of Dunning-Kruger, I really wonder if this is just a natural consequence of a given style of communication.
Another aspect to all this is how easy it seems to be to poison chatbots with basically just a few fake Reddit posts, where that information gets treated as gospel, or at least put on the same footing as more reputable information.