Numba is also really nice, and for just compiling against arrays I like it better than Cython. Neither helps if you need data structures or abstractions, though.
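To make the "compiling against arrays" case concrete, here's a minimal sketch of the kind of loop I mean. The function name `pairwise_l1` and the input are made up for illustration, and the `try/except` fallback is only there so the snippet still runs (slowly, as plain Python) if Numba isn't installed:

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch: without Numba, fall back to a no-op decorator.
    def njit(func):
        return func


@njit
def pairwise_l1(points):
    # Sum of |x_i - x_j| over all pairs i < j of a 1-D float array.
    # Explicit nested loops over array elements are what Numba compiles well.
    n = points.shape[0]
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += abs(points[i] - points[j])
    return total


result = pairwise_l1(np.array([0.0, 1.0, 3.0]))
```

This is the easy case: a flat float array in, a scalar out. Once the loop body has to touch dicts of custom objects or class hierarchies, you're back to the limitation above.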
Though as I said before, sometimes no amount of C/C++ escape hatches can improve performance, since you have to use Python objects at some point or another and that will be the bottleneck. But by then, you won't be needing some of the stuff Julia offers, like the REPL and notebooks etc.
I haven't used Julia a lot, but to me it's in a weird spot. In theory it would be ideal to start projects with, since you won't outgrow the language you started with: it's fast enough and has a pretty good/maintainable/sane design. But you sacrifice so much up front, and need so much more time to get started, that you might never get to that point anyway.
Just as an example, debugging obscure problems or deploying more custom PyTorch models in prod is already pretty daunting at times, and that's with the "best" and most popular ML framework in the world. I can't imagine how much more time-consuming it would be with a much smaller, less-used library/framework.
So yeah, all of that to say that being way faster isn't how Julia will win. Maybe a push from an influential player/big tech might give it the momentum it needs.