It's the internal state; that's what they mean by "hidden activations".
If the claim were just about the output it'd be easy to falsify. For example, the prompts "What color is the sky? Answer in one word." and "What color is the "B" in "ROYGBIV"? Answer in one word." should both result in the same output ("Blue") from any reasonable LLM.