1. Can unoptimised Julia code run faster than unoptimised Python code (with NumPy doing the heavy lifting)?
Let's say one is prototyping an algorithm, so iteration speed matters more than running speed. One can then choose either Julia or Python (perhaps with NumPy's help) and get a working implementation in a similar timeframe, so Julia won't necessarily be more attractive here.
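To make the "NumPy doing the heavy lifting" part concrete, here is a rough sketch of what such a Python prototype might look like. The function names and the sum-of-squares task are made up purely for illustration; the point is only that both versions take about the same effort to write, while the second one hands the loop to NumPy's compiled code.

```python
import numpy as np


def sum_of_squares_loop(values):
    """Plain-Python prototype: one interpreted iteration per element."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def sum_of_squares_numpy(values):
    """Same computation, with the loop pushed into NumPy's compiled code."""
    arr = np.asarray(values, dtype=np.float64)
    return float(np.dot(arr, arr))


# Hypothetical input data, only to show that both versions agree.
data = [0.5 * i for i in range(1000)]
print(sum_of_squares_loop(data), sum_of_squares_numpy(data))
```

How much faster the NumPy version runs depends on the input size and the machine, so it's worth timing both (e.g. with `timeit`) rather than assuming a particular speedup.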
Now, if the prototype shows that running speed is critical to applying the algorithm successfully, the developer has to optimise the hell out of it. One can either:
1. Optimise the Julia codebase, if Julia was used to prototype, following the many tips and tricks available (e.g. type stability, various macros, etc.).
2. Port the algorithm to C/C++, applying the many performance best practices that people have accumulated over the years.
So if the optimised C/C++ port can be any faster than the optimised Julia code, the rational choice would be to port the implementation to C/C++. That would also give Python some advantage over Julia in the prototyping phase, since the prototype gets rewritten either way and Python is the more popular language. Otherwise, I'd agree that using a single language for both prototyping and production is best.