In practice, you end up with an AI that won't do the former for the average person but can still be prompt-engineered into doing the latter by a sufficiently determined attacker.
Probably. But AI alignment research is still extremely primitive (I believe the field describes itself as "pre-paradigmatic"), so giving researchers some time to find a better way is at least worth trying.