However, Numba is quite limited: it only works well for numerical code (it cannot apply its optimizations to complex objects, like lists of dictionaries), whereas Julia's compiler applies its optimizations to everything.
[1] https://discourse.julialang.org/t/comparing-python-julia-and...
* Julia can do loop fusion when broadcasting, while numpy can't, meaning numpy uses a lot more memory during complex operations. (Numba can handle loop fusion, but it's generally much more restrictive.)
* A lot of code in real applications is glue code in Python, which is slow. I've literally found in some applications that <5% of the time was spent in numpy code, despite that being 90% of the code.
That said, if your code is mostly in numba with no pure python glue code (not just numpy), you probably won't see much of a difference.
EDIT: And last time I checked, numpy only parallelizes linear algebra calls, and only if you have the right BLAS library installed. A simple vector arithmetic operation like a + b will execute on one core only.
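To make the loop-fusion point above concrete, here is a minimal sketch (in plain numpy and Python, so it runs anywhere) of what fusion means: numpy evaluates each operator eagerly and allocates a temporary array per step, while a fused evaluation does all the arithmetic in one pass. Julia's dotted broadcasting and numba's @njit loops compile to something like the fused version; the slow Python loop here is only an illustration of the access pattern, not of the speed.

```python
import numpy as np

a = np.random.rand(1000)
b = np.random.rand(1000)
c = np.random.rand(1000)

# numpy, unfused: (a + b) allocates a temporary array, then * c
# allocates another -- two full passes over memory.
unfused = (a + b) * c

# Fused: the same arithmetic in a single pass, one element at a time,
# with no intermediate arrays.
fused = np.empty_like(a)
for i in range(len(a)):
    fused[i] = (a[i] + b[i]) * c[i]

assert np.allclose(unfused, fused)
```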
Most people use a ton of numpy and scipy. It turns out that phrasing things as array operations with numpy operators is quite natural in this field, including for things like galaxy merger simulations.
I work, in particular, on asteroid detection and orbit simulation, and it's all pretty much Python.
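As an illustration of that array-operator style (a hypothetical sketch, not code from any real simulation): pairwise gravitational accelerations for an N-body step can be phrased entirely with broadcasting, no explicit loops.

```python
import numpy as np

G = 6.674e-11          # gravitational constant
n = 50
pos = np.random.rand(n, 3)   # particle positions (made-up data)
mass = np.random.rand(n)     # particle masses

# Displacement vectors between every pair via broadcasting: shape (n, n, 3),
# d[i, j] points from particle i to particle j.
d = pos[np.newaxis, :, :] - pos[:, np.newaxis, :]

# Squared distances; adding the identity keeps the diagonal nonzero
# (the i == j terms contribute nothing anyway, since d[i, i] == 0).
r2 = np.sum(d * d, axis=-1) + np.eye(n)

# Acceleration on each particle: G * sum_j m_j * d_ij / |d_ij|^3
acc = G * np.sum(mass[np.newaxis, :, None] * d / r2[:, :, None] ** 1.5, axis=1)
```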
You can get very far with these approaches in Python, but having them at the language level offers more potential for optimization and less friction.
The debuggability of numba code is very limited, and code coverage does not work at all.
Having a high level language that has scientific use at its core is just great.
Python has the maturity and community size on its side, but Julia is catching up on that quickly.
“It turns out that phrasing things as array operations with numpy operators is quite natural in this field”
But if A and B are numpy arrays, then A + B will calculate the elementwise sum on a single core only, correct? It will vectorize, but not parallelize. All large-scale computation is multi-core.
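Correct: a plain A + B stays on one core. One way to get multiple cores from Python (a sketch, not something numpy does by itself) is to exploit the fact that numpy releases the GIL inside its ufunc loops, so chunked work on a thread pool can genuinely run in parallel; `parallel_add` is a made-up helper name.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def parallel_add(a, b, n_workers=4):
    """Elementwise a + b, computed chunk-by-chunk on a thread pool.

    numpy releases the GIL inside np.add, so the chunks can run on
    separate cores -- unlike a plain `a + b`, which uses one core.
    """
    out = np.empty_like(a)
    # Contiguous chunk boundaries; slices of `out` are views, so each
    # worker writes its result directly into the shared output array.
    bounds = np.linspace(0, len(a), n_workers + 1).astype(int)

    def work(i):
        lo, hi = bounds[i], bounds[i + 1]
        np.add(a[lo:hi], b[lo:hi], out=out[lo:hi])

    with ThreadPoolExecutor(n_workers) as pool:
        list(pool.map(work, range(n_workers)))
    return out
```

In practice numba's `@njit(parallel=True)` with `prange`, or a threaded BLAS for matrix operations, achieves the same thing with less ceremony.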
That's basically the answer.