But since the Python runtime is written in C, the issue can't be Python vs. C.
> However, python by default has a small offset when reading memories while lower level language (rust and c)
Yet if the runtime is made with C, then that statement is incorrect.
The point is not that one language is faster than another. The point is that, in this specific scenario, the default way to implement something in one language turned out to be surprisingly faster than the default way in the others, because of a performance issue in the hardware.
In other words: on this specific hardware, the default way to do this in Python is faster than the default way to do this in C and Rust. That can be true because Python does not use C in the default way: it adds an offset! You can change your implementation in any of those languages to make it faster, in this case by just adding an offset, so it doesn't mean that "Python is faster than C or Rust in general".
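To make "just adding an offset" concrete, here is a rough sketch of the idea in Python. The function name `read_with_offset` and the 32-byte default are made up for illustration, not taken from the article or from any runtime. It only shows the shape of the workaround (read into a destination that starts a few bytes past the start of the allocation); it does not claim to reproduce the slowdown, and `bytearray` makes no alignment guarantees of its own.

```python
def read_with_offset(path, size, offset=32):
    """Read up to `size` bytes from `path` into a buffer shifted by `offset` bytes.

    `offset=32` is an arbitrary, hypothetical value chosen for this sketch.
    """
    # Over-allocate by `offset` bytes so the read destination does not
    # begin at the very start of the allocation.
    buf = bytearray(size + offset)
    view = memoryview(buf)[offset:]

    total = 0
    # buffering=0 gives a raw file object, so readinto() writes straight
    # into our buffer without an intermediate copy.
    with open(path, "rb", buffering=0) as f:
        while total < size:
            n = f.readinto(view[total:])
            if not n:  # end of file
                break
            total += n

    return bytes(view[:total])
```

The same trick carries over to C or Rust: allocate slightly more than you need and hand the read call a pointer or slice that starts a few bytes in.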