Can you explain why three LLMs being able to identify the issue proves that it was prompted by a human? The major reason we do multi-agent orchestration is that self-reflection mechanisms within a single agent are much weaker than reflection between separate agents. It seems completely plausible that an LLM could produce output that a separate process wouldn't agree with.
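
To make that concrete, here's a minimal sketch of the generator/critic loop I have in mind. The names (`answer_with_cross_agent_review`, `call_llm`) are hypothetical; the callable stands in for whatever model API you're using:

```python
from typing import Callable

def answer_with_cross_agent_review(
    task: str,
    call_llm: Callable[[str], str],  # hypothetical stand-in for any prompt -> completion API
    max_rounds: int = 3,
) -> str:
    """Generator/critic loop: a second agent reviews the first agent's output."""
    draft = call_llm(f"Complete this task:\n{task}")
    for _ in range(max_rounds):
        # The critic sees only the task and the draft, not the generator's
        # reasoning -- that independence is the point of the pattern.
        verdict = call_llm(
            "You are reviewing another model's answer.\n"
            f"Task: {task}\nAnswer: {draft}\n"
            "Reply APPROVE if it is correct; otherwise describe the flaw."
        )
        if verdict.strip().upper().startswith("APPROVE"):
            return draft
        # Disagreement: the generator revises against the external critique.
        draft = call_llm(
            f"Task: {task}\nPrevious answer: {draft}\n"
            f"Reviewer feedback: {verdict}\nProduce a corrected answer."
        )
    return draft
```

The critic can disagree with the generator precisely because it never sees the generator's internal rationale, which is exactly the asymmetry a single agent's self-check lacks.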