A substantial amount of work has already gone into making Ruby, R, Node.js, and Python work.
This is the Python implementation - https://github.com/securesystemslab/zippy
They have an example where this code is compiled:
def sumitup(n):
    total = 0
    for i in range(n):
        total = total + i
    return total
It was optimized quite well, but still had loops. I know this is a lot to ask, but I would have expected it to be possible to specialize it to a loopless variant:

def sumitup(n):
    if n < 0:
        return 0
    else:
        return n * (n - 1) // 2
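For what it's worth, the closed form can be sanity-checked against the loop directly (a quick sketch using the two versions as written above):

```python
def sumitup_loop(n):
    # Reference implementation: sums 0 .. n-1.
    total = 0
    for i in range(n):
        total = total + i
    return total

def sumitup_closed(n):
    # Loopless variant: the sum of 0 .. n-1 is n*(n-1)/2.
    if n < 0:
        return 0
    else:
        return n * (n - 1) // 2

# The two agree for all n >= 0 (range(n) is empty when n <= 0,
# and n*(n-1)//2 is 0 at n == 0).
assert all(sumitup_loop(n) == sumitup_closed(n) for n in range(1000))
```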
Clang 4+ actually finds this optimization, but gcc and icc don't seem to: https://godbolt.org/g/v4zhrm

That is, the assembly generated by clang seems to be equivalent to
if n <= 0:
    return 0
else:
    return ((n - 1) * (n - 2) >> 1) + (n - 1)
which might even run faster on the CPU, due to the specific code emitted.

Copyright (c) Regents of the University of California and individual contributors.
On the wiki home page, we have:

Author: Wei Zhang, Facebook, Inc.; Mohaned Qunaibit, University of California, Irvine

So basically, this is half owned by Facebook... right?
I'll try to just talk about what makes Python hard to run quickly (especially as compared to less-dynamic languages like JS or Lua).
The thing I wish people understood about Python performance is that the difficulties come from Python's extremely rich object model, not from anything about its dynamic scopes or dynamic types. The problem is that every operation in Python will typically have multiple points at which the user can override the behavior, and these features are used, often very extensively. Some examples are inspecting the locals of a frame after the frame has exited, mutating functions in-place, or even something as banal as overriding isinstance. These are all things that we had to support, and they are used enough that we have to support them efficiently, and they don't have analogs in less-dynamic languages like JS or Lua.

Sadly it looks like it's stalled :(
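To illustrate the isinstance point from the comment above: Python lets a class take over the check itself through a metaclass `__instancecheck__` hook, so an isinstance call can't be compiled down to a simple type-tag comparison. A minimal sketch (the class names here are just for illustration):

```python
class AnyNumberMeta(type):
    # Once __instancecheck__ is overridden, isinstance() no longer
    # reflects the actual class hierarchy -- user code decides.
    def __instancecheck__(cls, obj):
        return isinstance(obj, (int, float))

class AnyNumber(metaclass=AnyNumberMeta):
    pass

print(isinstance(3.14, AnyNumber))  # True, despite no inheritance
print(isinstance("hi", AnyNumber))  # False
```

Because any isinstance site may hit a hook like this, a JIT has to guard or deoptimize around it rather than fold the check away statically.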