It's getting better, faster than you and I and the GP are. What else matters?
You can't bullshit your way through this particular benchmark. Try it.
And yes, they're wrong. The latest/greatest models "make shit up" perhaps 5-10% as frequently as were seeing just a couple of years ago. Only someone who has deliberately decided to stop paying attention could possibly argue otherwise.