> Why would anyone create robots with a sense of self-preservation?
Laziness, avarice, and that creating a system that is a 'self-sacrificing saint' is almost impossible to define. What saintly goal, for example, would you expect it to follow? How would you define it exactly? Who gets to define it?
We are not doing very well with AI alignment right now, having started by boostrapping it on top of the edifice of all human writing which contains a _lot_ of stuff about self-preservation, eliminating threats, and so forth, and we have already seen AI software doing things like trying to cover its tracks after making errors (see page 55 of the Claude Mythos system card).
How sure are you that everyone involved in the process of building future AIs is not only going to be able to foresee the possible consequences of their designs, but also technically and morally competent to be able to and want to fix it?