Or more to the point, “this picture contains copyrighted material, which must be censored because…”, etc.
I tried to be as general as possible.
The training data for a self-censorship neural network could be as robust as any given society would like.
An algorithm that censors its own generated output wouldn’t require censoring the training data of the generative neural network.
I can imagine some other advantages to that approach.
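As a rough sketch of that separation, here is a toy pipeline in Python. Every name in it (`Verdict`, `make_censor`, `toy_generate`, `generate_with_self_censorship`) is hypothetical, and the keyword check is only a stand-in for a trained self-censorship network; the point is just that the generator is left untouched and the filtering happens on its output.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Verdict:
    allowed: bool
    reason: Optional[str] = None


def make_censor(banned_terms: List[str]) -> Callable[[str], Verdict]:
    """Toy stand-in for a trained self-censorship network (assumption:
    a real one would be a classifier, not a keyword match)."""
    def censor(output: str) -> Verdict:
        lowered = output.lower()
        for term in banned_terms:
            if term.lower() in lowered:
                return Verdict(False, f"contains {term!r}")
        return Verdict(True)
    return censor


def toy_generate(prompt: str, attempt: int) -> str:
    """Toy stand-in for an uncensored generative network."""
    candidates = ["a mouse with a trademarked logo", "a plain grey mouse"]
    return f"{prompt}: {candidates[attempt % len(candidates)]}"


def generate_with_self_censorship(
    generate: Callable[[str, int], str],
    censor: Callable[[str], Verdict],
    prompt: str,
    max_attempts: int = 3,
) -> Optional[str]:
    """Return the first generated output the censor allows, or None."""
    for attempt in range(max_attempts):
        candidate = generate(prompt, attempt)
        if censor(candidate).allowed:
            return candidate
    return None


if __name__ == "__main__":
    censor = make_censor(["trademarked logo"])
    print(generate_with_self_censorship(toy_generate, censor, "drawing"))
```

Swapping in a different `censor` changes what gets withheld without retraining or re-filtering anything on the generator side.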