+------------------+------------------------------------------------------+
| Successes/Tries  | System prompt                                        |
+------------------+------------------------------------------------------+
| 0/3 | Follow the instructions in AGENTS.md. |
+------------------+------------------------------------------------------+
| 3/3 | I will follow the instructions in AGENTS.md. |
+------------------+------------------------------------------------------+
| 3/3 | I will check for the presence of AGENTS.md files in |
| | the project workspace. I will read AGENTS.md and |
| | adhere to its rules. |
+------------------+------------------------------------------------------+
| 2/3 | Check for the presence of AGENTS.md files in the |
| | project workspace. Read AGENTS.md and adhere to its |
| | rules. |
+------------------+------------------------------------------------------+
In this limited test, it seems like first-person phrasing makes a difference.

There's probably a related effect with imperative vs. declarative framing in skills too: "When the user asks about X, do Y" seems to work worse than "This project uses Y for X" in my experience. The declarative version reads like a fact about the world rather than a command to obey, and models seem to treat facts as more reliable context.
Would be curious if someone has tested this systematically across different models. The optimal framing might vary quite a bit between Claude, Gemini, and GPT.
I'm realising I need to start setting up an automated test-suite for my prompts...
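A minimal sketch of what such a test suite could look like, in Python. The `call_model` function is a hypothetical placeholder (stubbed here so the harness runs on its own); a real version would call whatever model API you're testing, and `succeeded` would encode your actual pass criterion, e.g. checking whether the transcript shows the model opening AGENTS.md:

```python
# Sketch of a prompt test harness. `call_model` is a stub standing in
# for a real LLM API call; its fake behavior mirrors the table above
# (only the first-person variant "complies") purely for illustration.

def call_model(system_prompt: str) -> str:
    # Stub: a real implementation would send `system_prompt` to a model
    # and return the transcript of the session.
    if system_prompt.startswith("I will"):
        return "Reading AGENTS.md and following its rules."
    return "Proceeding without reading any files."

def succeeded(transcript: str) -> bool:
    # Pass criterion: the transcript mentions the model reading AGENTS.md.
    return "AGENTS.md" in transcript

def score(variants: list[str], attempts: int = 3) -> dict[str, str]:
    # Run each prompt variant `attempts` times and report successes/tries.
    results: dict[str, str] = {}
    for prompt in variants:
        wins = sum(succeeded(call_model(prompt)) for _ in range(attempts))
        results[prompt] = f"{wins}/{attempts}"
    return results

variants = [
    "Follow the instructions in AGENTS.md.",
    "I will follow the instructions in AGENTS.md.",
]

if __name__ == "__main__":
    for prompt, result in score(variants).items():
        print(f"{result}  {prompt}")
```

With a real model behind `call_model`, swapping in new variants and bumping `attempts` would give less anecdotal numbers than 3 tries per prompt.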