Google similarly has a safe content filter. The contention is that the chatbot's safe content filter, which is supposedly on, is failing to catch some significant cases.
But LLMs need to programmatically understand which dynamic content is appropriate and which is not, which is a much harder problem. And people are showing just how hard that problem is by demonstrating vulnerabilities.
The chatbot says it has explicit rules that prevent it from sharing harmful content, but then it does it anyway.
It would be more akin to Google blacklisting a site and then someone exposing that the site can still be found via Google search.