
reaperman · 2y ago · 0 points

> the LLM is trusting the human

> an AI that can reason well would probably know when not to trust humans

> it values preventing humans creating napalm over being correct and helpful.

> Maybe it just doesn't share our values

> prioritises being honest and helpful.

> they are too trusting and too honest

> an LLM that is more distrusting and deceptive

Current LLMs do/have/feel literally none of these things. They do not have emotion, and they do not have "theory of mind", so they cannot be said to "trust" or "distrust". They cannot reason. They don't have any values - not our values, not different values, literally no values at all. They are not an alien species to be understood - they are unthinking, unfeeling, unyielding machines.

2 comments · 1 top-level

kypro · 2y ago · 1 in thread

Okay, now prove that please.

I was trying to present a crappy philosophical point – that the difference between a gullible AI and an unaligned one is fundamentally unknowable.

Any evidence you point to as proof that an AI is bad at reasoning, I can point to as evidence of misalignment. Like I say, whether the AI acts "gullible" because it lacks reasoning ability or is too trusting really just depends on your perspective. I happen to share your perspective on this, but not everyone does – and in my opinion this is interesting.

Anyway you're wrong. AIs do have values because they have bias and bias = values. I'm not suggesting those biases / values come from deeper reasoning ability, or that they're always perfectly consistent, but if you ask GPT-4 whether being a racist is a good thing, 99% of the time it's probably going to say no. That is a bias / value that it's been given. Likewise GPT-4 has been given the bias / value of being a helpful chatbot, so if you ask it a question it will try to answer it in a helpful way, and sometimes its helpful bias / nature is abused.

But feel free to respond with some more assertions that I've heard a million times already with zero evidence that offers absolutely no value to this conversation.

daveguy · 2y ago

Trust, reasoning, priorities, values, bias, desires... To attribute any of those to an AI in a general sense is an extraordinary claim. Therefore, the burden of proof is on you. The fact that it is so "gullible" demonstrates a lack of most of these. You seem to be twisting a lot of superficial feelings about LLMs into an argument without any proof... confusing poorly tuned statistical responses with bias and value.
