This is a common misunderstanding. Transformers are actually Turing complete, provided the model is allowed arbitrary-precision arithmetic and an unbounded number of decoding steps (the idealized setting assumed in both papers below):
* On the Turing Completeness of Modern Neural Network Architectures, https://arxiv.org/abs/1901.03429
* On the Computational Power of Transformers and its Implications in Sequence Modeling, https://arxiv.org/abs/2006.09286