1. I ran 3,360 safety tests on GPT-4o, Claude, Grok, DeepSeek, Gemini (github.com) 4aestrad71d ago6