One thing missing from most RL environment discussions is observability during training. Single-agent envs are hard enough to debug, but multi-agent environments are a completely different challenge: an aggregate reward curve tells you almost nothing about which agent failed or why cooperation broke down.
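To make that concrete, here's a rough sketch of the kind of per-agent logging I mean. Everything in it is made up for illustration: `PerAgentRewardLog` is a hypothetical helper, not part of any RL library, and it assumes the env hands back a dict of `{agent_id: reward}` at each step.

```python
from collections import defaultdict
from statistics import mean


class PerAgentRewardLog:
    """Hypothetical helper: tracks episode returns per agent, not one team total."""

    def __init__(self):
        # agent_id -> list of episode returns
        self.returns = defaultdict(list)

    def record_episode(self, step_rewards):
        """step_rewards: list of {agent_id: reward} dicts, one per env step (assumed format)."""
        totals = defaultdict(float)
        for rewards in step_rewards:
            for agent_id, reward in rewards.items():
                totals[agent_id] += reward
        for agent_id, total in totals.items():
            self.returns[agent_id].append(total)

    def summary(self, window=10):
        """Mean return per agent over the last `window` episodes."""
        return {a: mean(r[-window:]) for a, r in self.returns.items()}


log = PerAgentRewardLog()
log.record_episode([{"a": 1.0, "b": 0.0}, {"a": 1.0, "b": -1.0}])
log.record_episode([{"a": 1.0, "b": -1.0}, {"a": 1.0, "b": -1.0}])
summary = log.summary()
print(summary)
```

In this toy run the team total drops from one episode to the next, and the per-agent split shows the drop comes entirely from agent `b`. It still won't tell you why cooperation broke down, but it at least narrows down where to look.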
I'm guessing their issue isn't about the vowel; it's about the number mismatch between the singular article "an" and the plural noun phrase "frequently asked questions".