Transformers were designed to solve the "context" problem. The authors knew that RNNs scale poorly and don't solve that particular problem either, so they had to come up with an algorithm that overcomes both issues. It turned out that the way self-attention scales with compute was the crucial ingredient, something RNNs were totally incapable of because they have to process a sequence one step at a time.
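To make the scaling point a bit more concrete, here is a rough NumPy sketch. It is my own toy illustration, not code from the paper: the function names (`self_attention`, `rnn`), the weight matrices, and the sizes are all made up for the example. The thing to notice is that attention handles every position with a few matrix multiplications, while the RNN needs a loop in which each step waits for the previous one.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a whole sequence."""
    # Queries, keys and values for ALL positions come from three matmuls.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])          # (seq_len, seq_len)
    scores -= scores.max(axis=-1, keepdims=True)     # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ v                               # (seq_len, d)

def rnn(x, w_x, w_h):
    """Plain tanh RNN: step t cannot start until step t-1 has finished."""
    h = np.zeros(w_h.shape[0])
    outputs = []
    for x_t in x:                                    # inherently sequential
        h = np.tanh(x_t @ w_x + h @ w_h)
        outputs.append(h)
    return np.stack(outputs)

# Hypothetical toy sizes and random weights, purely for illustration.
rng = np.random.default_rng(0)
seq_len, d = 6, 4
x = rng.normal(size=(seq_len, d))
w_q, w_k, w_v, w_x, w_h = (rng.normal(size=(d, d)) for _ in range(5))

attn_out = self_attention(x, w_q, w_k, w_v)
rnn_out = rnn(x, w_x, w_h)
print(attn_out.shape, rnn_out.shape)
```

Both calls return one vector per position, but only the attention version can be spread across many parallel compute units along the sequence dimension, which is roughly what I mean by "scaling" here.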
They designed the algorithm to run on the hardware available at the time, but the hardware developed afterwards was a direct consequence (or, as I called it, a byproduct) of transformers proving that they could keep scaling. Had that not been true, we wouldn't have all those iterations of NVIDIA chips.
So, although one could say that NVIDIA's chip design is what enabled the transformers' success, one could also say that we wouldn't have those chips if transformers hadn't proven themselves to be so damn efficient. And I'm inclined to think the latter.