Reality check: why do you think type hints and type checkers like mypy and pyright have taken so long to mature, and even now aren't all the way there? If this were just a matter of ignoring some obscure, rarely used features, then mypy would work with essentially no type annotations, everything inferred automatically. Anybody who has tried to work with type annotations in Python knows how hard this is.
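To make that concrete, here's a toy sketch (my own example, not from mypy's docs) of the kind of untyped code that runs fine but gives a checker little to work with: without annotations, the function is effectively `Any -> Any`, so "automatic inference" doesn't get you real type checking.

```python
# A toy example (mine, not from mypy's docs): perfectly valid dynamic Python
# that a type checker can't pin down without added annotations.
def first(items):
    # Used at several different types below; unannotated, a checker
    # has to treat this as accepting and returning Any.
    return items[0]

print(first([1, 2, 3]))      # works with a list of ints
print(first(["a", "b"]))     # and with strings
print(first((3.0, 4.0)))     # and with a tuple of floats
```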
So those guys are quite obviously overselling their product. I can understand it; academic life is hard, and once you've completed your Ph.D., what can you do? You need to stand out. But these claims don't pass the smell test, sorry.
I mean... if you had a 'compiler' for Python that looked at my code at runtime (including imports) and all my current input, and, given the data types it sees and nothing more, did type inference and recompilation down to LLVM and then to my machine code, while keeping things that already call compiled modules (like numpy) as separate subroutines and thus only operating on the 'slow' parts of my code, with the speedups therein... I'd be sold.
Of course, I think I basically just described Julia.
I have a couple of projects I’ve been wanting to tackle but put off because I like python but it wouldn’t be a very good fit due to performance reasons. Now I get a whole new herd of yaks to shave.
Plus, extensible compiler? Who doesn’t want linq in python?
- Codon is a completely standalone (from CPython) compiler that was started with the goal of statically compiling as much Python code as possible, particularly for scientific computing use cases. We're working on closing the gap further, both in what we can statically compile and in automatically falling back to CPython in cases we can't handle. Some of the examples brought up here are actually in the process of being supported via e.g. union types, which we just added (https://docs.exaloop.io/codon/general/releases).
- You can actually use any plain Python library in Codon (TensorFlow, matplotlib, etc.) — see https://docs.exaloop.io/codon/interoperability/python. The library code will run through Python though, and won't be compiled by Codon. (We are working on a Codon-native NumPy implementation with NumPy-specific compiler optimizations, and might do the same for other popular libraries.)
- We already use Codon and its compiler/DSL framework to build quite a few high-performance scientific DSLs. For example, Seq for bioinformatics (the original motivation for Codon), and others are coming out soon.
Hope you're able to give Codon a try and looking forward to further feedback and suggestions!
Great work so far!
What technical limitations does an import or 3rd party add that a script wouldn't have?
This seems like a very different language from Python if it won't let you do:
[1, 'a string']
While searching I found a paper from 2021 about Codon [1]. The author is not on the About page of Exaloop [2], but the supervisor of that thesis is there. From the "Future Work" section:
> we plan to add union types and inheritance. On the IR side [Intermediate Representation], we hope to develop additional builtin transformations and analyses, all the while expanding the reach of existing passes. As far as library support, we plan to port existing high-performance Python libraries like NumPy [...] to Codon; this will allow Codon to become a drop-in replacement for Python in many domains.
Maybe they already did.
[1] Codon: A Framework for Pythonic Domain-Specific Languages by Gabriel L. Ramirez https://dspace.mit.edu/bitstream/handle/1721.1/139336/Ramire...
Looking forward to giving Codon a try!
But Python core developers place a high value on not breaking anyone's code (Python 3 itself was a huge trip on that front, and they're not making that mistake again), which is why things may seem slow on their end. But work is being done, and the results are there if you benchmark things.
[0] See https://github.com/faster-cpython/ideas/blob/main/FasterCPyt... however that's over a year old already and I'm sure I've read/heard more specifics
https://bugs.debian.org/cgi-bin/pkgreport.cgi?tag=python3.9&...
Without the ecosystem of libraries, I am afraid the use cases for Codon will be very, very limited. Python developers (just like Node developers) are used to thinking: need to do X? Let's see if I can pip install a library that does it.
Ultimately, Python is like super-flexible glue between an ecosystem of libraries that lets anyone build and prototype high-quality software very quickly.
And none of these languages is less powerful than Lisp, or lacks Unicode support, or whatever, so this can't be the reason.
The only real way out is to make Python 4 - but given the immense pain of the Python 2 -> 3 transition that seems unlikely.
One nice thing about Go is that the Go 1.19 compiler will compile Go 1.1 code just fine, so people can iterate from 1.1 to 1.19 at their own pace, or not, if they so choose. It would not be that hard for a Go v2 to continue to allow similar compilation of old code.
Here is some old 2014 post:
http://jakevdp.github.io/blog/2014/05/09/why-python-is-slow/
As other commenters pointed out, some of these Python features, which go unused 99.99% of the time, could be sacrificed for additional speedup by breaking backwards compatibility.
Python does have lesser-used dynamic capabilities that probably don't exist in Common Lisp. Those capabilities make it difficult to optimize arbitrary valid Python code, but most people who need a Python compiler would be happy to make adjustments.
For example if you write `l.sort()` where l is a list, we can make it very fast to figure out that you are calling the list_sort() C function. Unfortunately that function is quite slow because every comparison uses a dynamic multiple-dispatch resolution mechanism in order to implement Python semantics.
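A small illustration (my own, using nothing beyond standard CPython) of the dispatch cost described above: every single comparison the sort performs goes back through Python's rich-comparison machinery rather than a fixed compare function.

```python
# Each comparison in list.sort() dispatches dynamically; here the protocol
# resolves to our __lt__, and we count how often that happens.
class Noisy:
    comparisons = 0

    def __init__(self, v):
        self.v = v

    def __lt__(self, other):
        # Looked up and invoked through the rich-comparison protocol
        # on every comparison the sort performs.
        Noisy.comparisons += 1
        return self.v < other.v

l = [Noisy(3), Noisy(1), Noisy(2)]
l.sort()
print([x.v for x in l])    # [1, 2, 3]
print(Noisy.comparisons)   # at least 2 dynamically dispatched comparisons
```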
> Codon currently uses ASCII strings unlike Python's unicode strings.
Note the word "currently." Implementing this while also tracking the constantly evolving Python language through its various versions is a lot of work. They are apparently prioritizing other things over this particular aspect.
edit: it's statically typed ahead of time - that feels like something that needs a detailed description of what it's doing, given the baseline of like-python
EDIT: A list of differences here: https://docs.exaloop.io/codon/general/differences
The summary minimizes with "many Python programs will work with few if any modifications", but it actually looks like a substantially different language.
"While Codon supports nearly all of Python's syntax, it is not a drop-in replacement, and large codebases might require modifications to be run through the Codon compiler. For example, some of Python's modules are not yet implemented within Codon, and a few of Python's dynamic features are disallowed. The Codon compiler produces detailed error messages to help identify and resolve any incompatibilities."
> Since Codon performs static type checking ahead of time, a few of Python's dynamic features are disallowed. For example, monkey patching classes at runtime (although Codon supports a form of this at compile time) or adding objects of different types to a collection.
While monkey patching is maybe not done so much in Python (outside of unit testing), adding objects of different types to a collection is definitely a common operation!
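Both features the docs mention are one-liners in everyday CPython; a minimal sketch (my own example) of the two disallowed patterns:

```python
# Ordinary CPython, but off-limits to a strictly ahead-of-time type checker:

# 1) Objects of different types in one collection.
mixed = [1, "a string", 3.14]
print([type(x).__name__ for x in mixed])  # ['int', 'str', 'float']

# 2) Monkey patching a class at runtime.
class Greeter:
    pass

Greeter.hello = lambda self: "hi"  # method attached after class creation
print(Greeter().hello())           # hi
```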
Which is an implementation detail that is not guaranteed by the language standard.
Porting code from 2 to 3 made me have to use a sorted dict because the code relied on the insertion order (metaclass magic operating on the class dict) but when they revamped the dictionary implementation I could do away with that fix. Until they come up with a more efficient dict and break everyone’s code again.
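For context on the behavior in question: CPython 3.6 made dicts insertion-ordered as an implementation detail, and Python 3.7 promoted that to a language guarantee, which is why the workaround could eventually be dropped. A quick check:

```python
# Since Python 3.7, dicts preserve insertion order by language guarantee
# (in CPython 3.6 it was only an implementation detail).
d = {}
for key in ["banana", "apple", "cherry"]:
    d[key] = len(key)

print(list(d))  # ['banana', 'apple', 'cherry'] (insertion order, not sorted)
```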
Does make me want to dust off the old spark parser and see what this can do with it.
https://github.com/py2many/py2many/blob/main/doc/langspec.md
Would be great if all the authors of "python-like" languages get together and come up with a couple of specs.
I say a couple, because there are ones that support the python runtime (such as cython) and the ones which don't (like py2many).
So, I think this might qualify as much as a python implementation as PyPy.
Also, I'm not actually sure what they mean by internal dict sorting. Do they mean insertion order stability?
These may or may not be the biggest concerns.
See https://github.com/mypyc/mypyc/blob/master/show_me_the_code....
There's also Nuitka as yet another alternative.
Or if you're going to use a "Python-like" compiled language, consider using Nim.
Further down:
Codon is a Python-compatible language, and many Python programs will work with few if any modifications:
What makes it faster than C++?
I see this in the documentation but I am not sure it helps me (not an expert):
> C++? Codon often generates the same code as an equivalent C or C++ program. Codon can sometimes generate better code than C/C++ compilers for a variety of reasons, such as better container implementations, the fact that Codon does not use object files and inlines all library code, or Codon-specific compiler optimizations that are not performed with C or C++.
If I were writing my own language, I might choose `let` instead of `def`. For example, `let x = 1`.
time python day_2.py
________________________________________________________
Executed in 57.25 millis fish external
usr time 25.02 millis 52.00 micros 24.97 millis
sys time 25.01 millis 601.00 micros 24.41 millis
time codon run -release day_2.py
________________________________________________________
Executed in 955.58 millis fish external
usr time 923.39 millis 62.00 micros 923.33 millis
sys time 31.76 millis 685.00 micros 31.07 millis
time codon run -release day_8.py
________________________________________________________
Executed in 854.23 millis fish external
usr time 819.11 millis 78.00 micros 819.03 millis
sys time 34.67 millis 712.00 micros 33.96 millis
time python day_8.py
________________________________________________________
Executed in 55.30 millis fish external
usr time 22.59 millis 54.00 micros 22.54 millis
sys time 25.86 millis 642.00 micros 25.22 millis
It wasn't a ton of work to get running, but I had to comment out some stuff that isn't available. Some notable pain points: I couldn't import code from another file in the same directory, and I couldn't do zip(*my_list) because the asterisk wasn't supported in that way. I would consider revisiting it if I needed a single-file program that has to work on someone else's machine, provided the compilation works as easily as in the examples.
Additionally, how does it compare to Numba, the compiler for Python?
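For reference, the zip(*...) idiom mentioned above is the standard CPython trick for transposing a list of lists:

```python
# zip(*rows) unpacks the sublists as separate arguments to zip,
# effectively transposing rows into columns.
rows = [[1, 2, 3], [4, 5, 6]]
cols = list(zip(*rows))
print(cols)  # [(1, 4), (2, 5), (3, 6)]
```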
Looks like Python's performance in the ML and AI field will only get stronger.
Would it be possible to write performance-sensitive parts of a Python system in Codon and link that to a CPython or PyPy runtime that supports more dynamic features?
I'm not sure what the terms are in this particular case, but in general, wanting someone to pay if they're deploying it to lots of customers seems reasonable.
On the one hand, any sort of fee (even $1) is a big impediment to adoption compared with free, and so is the idea that "if I try this out and like it, I'll be stuck with whatever costs they impose now and in the future." In a fast-moving world, open source is just easier, because if you change your mind you haven't wasted money. In addition, a project can only afford so many costs, and it is hard to know in advance where those will pop up, so you avoid them when you can.
That said, developers need to eat, and it is easy to appreciate that they let you see the source and play with it, but ask you to pay if you want to use it. I also fully support their right to license their software as they see fit.
Wrong, Redis has a BSD-3 license: https://github.com/redis/redis
Optional add-ons to Redis may have non-free licenses.
I agree that the devs of these tools need to be paid, but that particular avenue presents some roadblocks.
By the way, Redis is still BSD licensed.
Sort of a GraalVM for Python?
A quick search leads to pywasm and it is even native python. But is it usable? Any other options?
MyPyC requires type annotations to work. This does not.
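Roughly the difference: mypyc leans on explicit annotations to generate fast code, while Codon infers the types itself. A hedged sketch of the annotated style (my example, not taken from mypyc's docs):

```python
# The kind of fully annotated function mypyc relies on; under Codon the
# same function, minus the annotations, would be type-inferred instead.
def total(xs: list[int]) -> int:
    s: int = 0
    for x in xs:
        s += x
    return s

print(total([1, 2, 3]))  # 6
```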
Nice! A super-fast LLVM compiler for Python! Well done!
You know, if Python is one of the world's most popular languages, and it was originally implemented as a dynamic and interpreted language (but fast compilers can be written for it, as evinced by Codon!) -- then maybe it would make sense to take languages that were implemented as compilers -- and re-implement them as dynamic interpreted languages!
Oh sure -- that would slow them down by 10x to 100x!
But, even though that would be the case -- the dynamic interpreted versions of the previous compiled-only language -- might be a whole lot more beginner friendly!
In other words, typically in dynamic interpreted languages -- a beginner can use a REPL loop or other device -- to make realtime changes to a program as it is running -- something that is usually impossible with a compiled language...
The possibilities for easy logging, debugging, and introspection of a program -- are typically greater/easier -- in interpreted dynamic languages...
Oh sure, someone can do all of those things in compiled languages too -- but typically the additional set-up to accomplish them is more involved and nuanced -- beginners typically can't do those things easily!
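The kind of live change being described is trivial in an interpreter: redefine a function mid-run, as you would at a REPL, and every subsequent call picks up the new version with no rebuild step. A minimal sketch:

```python
# Redefine a function while the program is running; later calls
# pick up the new definition with no compile/link cycle.
def greet():
    return "hello"

print(greet())  # hello

def greet():    # live redefinition, as one might type at a REPL
    return "bonjour"

print(greet())  # bonjour
```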
So, I think when I think about programming languages from this point forward -- I'm going to think about them as having "two halves":
One half which is a compiled version.
And another half -- which is a dynamic interpreted version...
Usually when a new programming language is created in the world, it is created either as a compiled language or as a dynamic interpreted language -- but never both at the same time!
Usually it takes the work of a third party to port a given language from one domain to the other -- most often from dynamic interpreted to compiled, but sometimes (as with scripting languages derived from compiled languages) in the reverse!
Point is: There are benefits to be derived from each paradigm, both dynamic interpreted and compiled!
So why do we currently look at/think about -- most computer languages -- as either one or the other?
I'm going to be looking at all computer languages as potentially both, from this point forward...
(Related: "Stop Writing Dead Programs" by Jack Rusher (Strange Loop 2022): https://www.youtube.com/watch?v=8Ab3ArE8W3s&t=1383s)