Sure, it's not great at differentiating between SotA techniques, but it's very useful for sanity checks like this one.
Even for SotA models, it's still useful to verify that you can get greater than 98% accuracy on MNIST before exploring larger, more complex benchmarks.
It certainly shouldn't be the only benchmark, but it's a great place to start iterating on ideas.