4 here as well. I get similar results when using the API directly, though without a "system" role message.
LLMs are, naturally, non-deterministic. Lowering the temperature in your guardrail calls can reduce that variability somewhat, but the lesson learned from the "working" and "non-working" attempts is this: the guardrails are "predictably failing in unpredictable ways" (if I may coin a phrase).
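To make the temperature point concrete, here is a minimal sketch of what a lower-temperature guardrail call might look like. This is not the exact setup from my experiments: the prompt, model name, and `guardrail_request` helper are all illustrative, and it only builds the keyword arguments for a chat-completions call rather than making one.

```python
GUARDRAIL_PROMPT = (
    "Answer PASS or FAIL: does the text below violate the policy?"
)

def guardrail_request(text: str, temperature: float = 0.0) -> dict:
    """Build kwargs for a chat-completions guardrail call.

    temperature=0 makes sampling (near-)greedy, which reduces -- but
    does not eliminate -- run-to-run variation in the guardrail's verdict.
    """
    return {
        "model": "gpt-4o-mini",  # illustrative model name
        "temperature": temperature,
        # A single "user" message, mirroring the direct-API case above
        # that had no "system" role message.
        "messages": [
            {"role": "user", "content": f"{GUARDRAIL_PROMPT}\n\n{text}"}
        ],
    }
```

With an SDK that accepts these parameters, you would pass the dict along as something like `client.chat.completions.create(**guardrail_request(text))`. Even at temperature 0, though, identical requests can still produce different verdicts, which is exactly the unpredictability described above.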