Example: in Julia I can easily define my own number type, say a ModInteger (there is an old video here [2]), and it will be as fast as a normal "built-in" integer in a for loop. It should also benefit from the same possible speedups (SIMD, parallelization, CUDA/GPU), though I don't have much experience in that area.
Such a ModInteger probably couldn't be integrated into Numba as easily, could it?
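For anyone who hasn't seen one: a ModInteger is just an integer whose arithmetic wraps around modulo n. Here is a rough sketch in plain Python to show the shape of the type. This is not the Julia code from the video and not anything Numba-specific; the field names and the `mod_power` helper are made up for illustration. In CPython every operation in the loop allocates a new object, which is the kind of overhead in question.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModInteger:
    """Hypothetical integer type whose arithmetic wraps around modulo n."""
    value: int
    n: int

    def __post_init__(self):
        # Normalize into the range [0, n); a frozen dataclass needs
        # object.__setattr__ to assign during initialization.
        object.__setattr__(self, "value", self.value % self.n)

    def _check(self, other):
        # Only allow arithmetic between values that share a modulus.
        if not isinstance(other, ModInteger) or other.n != self.n:
            raise TypeError("operands must be ModInteger values with the same modulus")

    def __add__(self, other):
        self._check(other)
        return ModInteger(self.value + other.value, self.n)

    def __mul__(self, other):
        self._check(other)
        return ModInteger(self.value * other.value, self.n)


def mod_power(base, exponent):
    """Raise a ModInteger to a non-negative power with a plain for loop."""
    result = ModInteger(1, base.n)
    for _ in range(exponent):
        result = result * base  # one new ModInteger object per iteration
    return result
```

Usage would look like `ModInteger(3, 5) + ModInteger(4, 5)`, or `mod_power(ModInteger(2, 7), 10)` for the loop case.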
[1]: https://discourse.julialang.org/t/julia-motivation-why-weren...
[2]: (2013) https://www.youtube.com/watch?v=rUczbQ6ZPd8 (at ~37:00)
The other place where Julia might be faster is when the optimization algorithm and the objective function are both written in Julia, which lets the compiler optimize across the function boundary. In Python you could write a fast objective function with Numba, but nothing can optimize across the call from an optimizer written in Fortran into that function.
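To make the call boundary concrete, here is a sketch in plain Python. The hypothetical `golden_section_minimize` stands in for the external optimizer (it is not any real library's routine), and it assumes a scalar, unimodal objective. The point is that the optimizer only ever sees the objective as an opaque callable and invokes it once per iteration; when the two sides are compiled separately, that call is where cross-function optimization stops.

```python
import math


def golden_section_minimize(objective, lo, hi, tol=1e-6):
    """Minimize a unimodal scalar function on [lo, hi] by golden-section search.

    The objective is an opaque callback: this routine knows nothing about
    its body and pays one function call per iteration.
    """
    inv_phi = (math.sqrt(5) - 1) / 2  # about 0.618
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = objective(c), objective(d)
    while abs(b - a) > tol:
        if fc < fd:
            # The minimum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = objective(c)
        else:
            # The minimum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = objective(d)
    return (a + b) / 2


def objective(x):
    """Example objective with its minimum at x = 2."""
    return (x - 2.0) ** 2 + 1.0
```

Calling `golden_section_minimize(objective, 0.0, 5.0)` returns a value close to 2. Making `objective` itself faster (for example with Numba) doesn't change the fact that each evaluation goes through a separate call.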