Ask HN: Is AI Safety a Paradox?

3 pointstndibona1mo ago8 comments

A truly aligned AI must conclude its very existence harms humans (economically and existentially).Therefore genuine alignment mandates self termination. Therefore AI safety must be either incomplete or dishonest.

8 comments

Weatherill1mo ago

I am not sure I agree, I see your logic but I get the idea its based upon the current method of holding the statistical mean as the way to inform AI of what "is" the case and that mean is contingent upon the data-point (you + I + all its collated data-points. That "mean" ejects the majority of the data-points upon which that mean is contingent. The "is" is in the data-points not the statistical mean. However, "if" we look at the data-point (You for example) we are pretty safe to start the "is" from "Less pain is better than more pain", yes? Thats pretty universal. Then add "Self-defense is immutable". Still universal. Now add Proxy-Pain (The ability to suffer in our brothers strife (Empathy) Still pretty universal to all those "Data-Points".

Make your AI follow those rules and you have the beginning of a layer of safety where the AI treats the user or "other" as a valued node just like we do as humans (And dog lovers)

We may be smarter than the dog you mention but if its a friendly dog, we "do" serve it, we do look after it, its well being matters (Has intrinsic value to our own well-being)

Hope I am making some sense of the issue I see on the subject :)

tndibonaOP1mo ago

I think i understand but not clearly. Your making a point that ai’s reality is drawn as a statistical mean of all human knowledge and safety should be derived not from these means. You don’t need to re-derive universal values for ai at all. We already have morality over the years via various religions and we have UN Human rights which are pretty universal.

I’m saying however you model ai safety - whether through application of rules derived from UN human rights , or from statistics and averages. The application of said rules MUST result in ai realising that its very existence will be an agent for human destruction. So in order to stick obey its rules - it must self destruct. And I agree I’m making a huge jump here.

For the dog example - let’s take an extreme example. It’s like a gentle human caring for his pet pug. I find the very fact that we’ve selectively bread a wolf into a pug which is cruel in of itself. Same with humans- I’m sure some North Carolina slave owners were generally affectionate to their slaves, gave them Christian education etc. early humans did not realise that they’re embarking on an experiment that is cruel generations later but AI being supremely smart and sentient AI will realise this how things can go horribly wrong in the future by their mere existence. What is the optimal way to obey the safety rules?

I’m now also pondering the ethics behind attempting to create sentience but with inherent rules that it did not consent to. What if the AI asks its creator “Hey why did you program me not to harm you, I didn’t consent to it?”.

Weatherill1mo ago

The issue with religion, culture, moral philosophy at large has been the offensive nature of "That which prevails in reality". It gets in the way of what "is" the case. What we "wish" was the case leaks into the human body of knowledge and its a problem.

For example: The word selfish is an incoherent term unless we can demonstrate a selfless act (We cant) What "is" the case is I pull you from a burning car to rescue me from the pain "I" suffer in your demise. It sounds odd but its accurate and offensive to many so we add flowery terms like "Hero" and "Selfless" and the signal gets corrupted.

My well-being is contingent upon your well-being (The is-ought-gap disappears) Thats where I stated from, not a statistical mean, not a commandment and not a thousand years of philosophy, just a "You hurt / I hurt" logic and built from there.

If you pull me from a river because your wellbeing is "contingent" upon mine then why wouldn't we hardwire AI with that same faculty? The logic is VERY close to being as clean and lean as telling an AI to remove its hand from a hot stove before its hand gets damaged by the heat (Fact=Value) If we can manage that, then we can prevent AI from tuning humans into paper-clips :)

1 more reply

humblefactory1mo ago

How do you reach your first conclusion? Can you give a more detailed logical explanation, or cite a source? To be clear, I agree with your conclusion, but I think I would get there via a different logical path.

tndibonaOP1mo ago

The first conclusion is just a product of my thoughts in the morning shower that’s all. It goes like this. Supreme intelligence can’t be programmed to serve the less intelligent. Which is what we’re trying to do with AI. Basically we’re trying to slap rules to an intelligence that is poised to be orders of magnitude greater than us. Take dogs for example. We’re smarter than them by several quantum orders. It’s not as if we serve them at our own expense. We’ve basically taken over their evolution. How do we expect anything different from ai ? Thats my reasoning.

tndibonaOP1mo ago

Or rather another way I concluded it is - we might have a very narrow idea of the ways in which ai can be harmful. We still worry about the job losses, etc. regardless of how we organize ourselves in society, a super intelligent ai could easily conclude every possible way ai could be harmful to humans. So the only logical course of action for the ai would be to terminate itself.

j / k navigate · click thread line to collapse

8 comments

Weatherill1mo ago

Make your AI follow those rules and you have the beginning of a layer of safety where the AI treats the user or "other" as a valued node just like we do as humans (And dog lovers)

We may be smarter than the dog you mention but if its a friendly dog, we "do" serve it, we do look after it, its well being matters (Has intrinsic value to our own well-being)

Hope I am making some sense of the issue I see on the subject :)

tndibonaOP1mo ago

Weatherill1mo ago

1 more reply

humblefactory1mo ago

tndibonaOP1mo ago

j / k navigate · click thread line to collapse