Reality check: why do you think type hints and type checkers like mypy and pyright have taken so long to mature, and even now aren't all the way there? If this were just a matter of ignoring some obscure, rarely used features, then mypy would work with essentially no type annotations, everything inferred automatically. Anybody who has tried to work with type annotations in Python knows how hard this is.
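To make that concrete, here's a toy sketch (my own example, not from mypy's docs) of the kind of untyped code that runs fine but gives a checker little to work with: without annotations, the function is effectively `Any -> Any`, so "automatic inference" doesn't get you real type checking.

```python
# A toy example (mine, not from mypy's docs): perfectly valid dynamic Python
# that a type checker can't pin down without added annotations.
def first(items):
    # Used at several different types below; unannotated, a checker
    # has to treat this as accepting and returning Any.
    return items[0]

print(first([1, 2, 3]))      # works with a list of ints
print(first(["a", "b"]))     # and with strings
print(first((3.0, 4.0)))     # and with a tuple of floats
```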
So those guys are quite obviously overselling their product. I can understand it; academic life is hard, and once you've completed your Ph.D., what can you do? You need to stand out. But these claims don't pass the smell test, sorry.
I mean... if you had a 'compiler' for Python that looked at my code at runtime (including imports) and all my current input, and, given the data types it sees and nothing more, did type inference and recompilation down to LLVM and then to my machine code, while keeping things that already call compiled modules (like numpy) as separate subroutines and thus only operating on the 'slow' parts of my code, with the speedups therein... I'd be sold.
Of course, I think I basically just described Julia.
I have a couple of projects I’ve been wanting to tackle but put off because I like python but it wouldn’t be a very good fit due to performance reasons. Now I get a whole new herd of yaks to shave.
Plus, extensible compiler? Who doesn’t want linq in python?
- Codon is a completely standalone (from CPython) compiler that was started with the goal of statically compiling as much Python code as possible, particularly for scientific computing use cases. We're working on closing the gap further, both in what we can statically compile and in automatically falling back to CPython in cases we can't handle. Some of the examples brought up here are actually in the process of being supported via e.g. union types, which we just added (https://docs.exaloop.io/codon/general/releases).
- You can actually use any plain Python library in Codon (TensorFlow, matplotlib, etc.) — see https://docs.exaloop.io/codon/interoperability/python. The library code will run through Python though, and won't be compiled by Codon. (We are working on a Codon-native NumPy implementation with NumPy-specific compiler optimizations, and might do the same for other popular libraries.)
- We already use Codon and its compiler/DSL framework to build quite a few high-performance scientific DSLs. For example, Seq for bioinformatics (the original motivation for Codon), and others are coming out soon.
Hope you're able to give Codon a try and looking forward to further feedback and suggestions!
Great work so far!
What technical limitations does an import or 3rd party add that a script wouldn't have?
This seems like a very different language from Python if it won't let you do:
[1, 'a string']
While searching I found a paper from 2021 about Codon [1]. The author is not on the About page of Exaloop [2], but the supervisor of that thesis is there. From the "Future Work" section:
> we plan to add union types and inheritance. On the IR side [Intermediate Representation], we hope to develop additional builtin transformations and analyses, all the while expanding the reach of existing passes. As far as library support, we plan to port existing high-performance Python libraries like NumPy [...] to Codon; this will allow Codon to become a drop-in replacement for Python in many domains.
Maybe they already did.
[1] Codon: A Framework for Pythonic Domain-Specific Languages by Gabriel L. Ramirez https://dspace.mit.edu/bitstream/handle/1721.1/139336/Ramire...
Looking forward to giving Codon a try!
But Python core developers place a high value on not breaking anyone's code (Python 3 itself was a huge trip on that front, and they're not making that mistake again), which is why things may seem slow on their end. But work is being done, and the results are there if you benchmark things.
[0] See https://github.com/faster-cpython/ideas/blob/main/FasterCPyt... however that's over a year old already and I'm sure I've read/heard more specifics
https://bugs.debian.org/cgi-bin/pkgreport.cgi?tag=python3.9&...
Without the ecosystem of libraries, I am afraid the use cases for Codon will be very, very limited. Python developers (just like Node developers) are used to thinking: need to do X? Let's see if I can pip install a library that does it.
Ultimately, Python is like super-flexible glue between an ecosystem of libraries that lets anyone build and prototype high-quality software very quickly.
And none of these languages is less powerful than Lisp, or lacks Unicode support, or whatever, so this can't be the reason.
The only real way out is to make Python 4 - but given the immense pain of the Python 2 -> 3 transition that seems unlikely.
One nice thing about Go is that the Go 1.19 compiler will compile Go 1.1 code just fine, so people can iterate from 1.1 to 1.19 at their own pace, or not, if they so choose. It would not be that hard for a Go v2 to continue to allow similar compilation of old code.
Here is some old 2014 post:
http://jakevdp.github.io/blog/2014/05/09/why-python-is-slow/
As other commenters pointed out, some of these Python features, which go unused 99.99% of the time, could be sacrificed for additional speedup by breaking backwards compatibility.
Python does have lesser-used dynamic capabilities that probably don't exist in Common Lisp. Those capabilities make it difficult to optimize arbitrary valid Python code, but most people who need a Python compiler would be happy to make adjustments.
For example if you write `l.sort()` where l is a list, we can make it very fast to figure out that you are calling the list_sort() C function. Unfortunately that function is quite slow because every comparison uses a dynamic multiple-dispatch resolution mechanism in order to implement Python semantics.
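A small illustration (my own, using nothing beyond standard CPython) of the dispatch cost described above: every single comparison the sort performs goes back through Python's rich-comparison machinery rather than a fixed compare function.

```python
# Each comparison in list.sort() dispatches dynamically; here the protocol
# resolves to our __lt__, and we count how often that happens.
class Noisy:
    comparisons = 0

    def __init__(self, v):
        self.v = v

    def __lt__(self, other):
        # Looked up and invoked through the rich-comparison protocol
        # on every comparison the sort performs.
        Noisy.comparisons += 1
        return self.v < other.v

l = [Noisy(3), Noisy(1), Noisy(2)]
l.sort()
print([x.v for x in l])    # [1, 2, 3]
print(Noisy.comparisons)   # at least 2 dynamically dispatched comparisons
```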
> Codon currently uses ASCII strings unlike Python's unicode strings.
Note the word "currently." Implementing this while also tracking the constantly evolving Python language through its various versions is a lot of work. They are apparently prioritizing other things over this particular aspect.
edit: it's statically typed ahead of time - that feels like something that needs a detailed description of what it's doing, given the baseline of like-python
EDIT: A list of differences here: https://docs.exaloop.io/codon/general/differences
The summary minimizes with "many Python programs will work with few if any modifications", but it actually looks like a substantially different language.
"While Codon supports nearly all of Python's syntax, it is not a drop-in replacement, and large codebases might require modifications to be run through the Codon compiler. For example, some of Python's modules are not yet implemented within Codon, and a few of Python's dynamic features are disallowed. The Codon compiler produces detailed error messages to help identify and resolve any incompatibilities."
> Since Codon performs static type checking ahead of time, a few of Python's dynamic features are disallowed. For example, monkey patching classes at runtime (although Codon supports a form of this at compile time) or adding objects of different types to a collection.
While monkey patching is maybe not done so much in Python (outside of unit testing), adding objects of different types to a collection is definitely a common operation!
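Both features the docs mention are one-liners in everyday CPython; a minimal sketch (my own example) of the two disallowed patterns:

```python
# Ordinary CPython, but off-limits to a strictly ahead-of-time type checker:

# 1) Objects of different types in one collection.
mixed = [1, "a string", 3.14]
print([type(x).__name__ for x in mixed])  # ['int', 'str', 'float']

# 2) Monkey patching a class at runtime.
class Greeter:
    pass

Greeter.hello = lambda self: "hi"  # method attached after class creation
print(Greeter().hello())           # hi
```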
Which is an implementation detail that is not guaranteed by the language standard.
Porting code from 2 to 3 made me have to use a sorted dict because the code relied on the insertion order (metaclass magic operating on the class dict) but when they revamped the dictionary implementation I could do away with that fix. Until they come up with a more efficient dict and break everyone’s code again.
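For context on the behavior in question: CPython 3.6 made dicts insertion-ordered as an implementation detail, and Python 3.7 promoted that to a language guarantee, which is why the workaround could eventually be dropped. A quick check:

```python
# Since Python 3.7, dicts preserve insertion order by language guarantee
# (in CPython 3.6 it was only an implementation detail).
d = {}
for key in ["banana", "apple", "cherry"]:
    d[key] = len(key)

print(list(d))  # ['banana', 'apple', 'cherry'] (insertion order, not sorted)
```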
Does make me want to dust off the old spark parser and see what this can do with it.
https://github.com/py2many/py2many/blob/main/doc/langspec.md
Would be great if all the authors of "python-like" languages get together and come up with a couple of specs.
I say a couple, because there are ones that support the python runtime (such as cython) and the ones which don't (like py2many).
So, I think this might qualify as much as a python implementation as PyPy.
Also, I'm not actually sure what they mean by internal dict sorting. Do they mean insertion order stability?
These may or may not be the biggest concerns.
See https://github.com/mypyc/mypyc/blob/master/show_me_the_code....
There's also Nuitka as yet another alternative.
Or if you're going to use a "Python-like" compiled language, consider using Nim.
Further down:
Codon is a Python-compatible language, and many Python programs will work with few if any modifications:
What makes it faster than C++?
I see this in the documentation but I am not sure it helps me (not an expert):
> C++? Codon often generates the same code as an equivalent C or C++ program. Codon can sometimes generate better code than C/C++ compilers for a variety of reasons, such as better container implementations, the fact that Codon does not use object files and inlines all library code, or Codon-specific compiler optimizations that are not performed with C or C++.
If I were writing my own language, I might choose `let` instead of `def`. For example, `let x = 1`.
time python day_2.py
________________________________________________________
Executed in 57.25 millis fish external
usr time 25.02 millis 52.00 micros 24.97 millis
sys time 25.01 millis 601.00 micros 24.41 millis
time codon run -release day_2.py
________________________________________________________
Executed in 955.58 millis fish external
usr time 923.39 millis 62.00 micros 923.33 millis
sys time 31.76 millis 685.00 micros 31.07 millis
time codon run -release day_8.py
________________________________________________________
Executed in 854.23 millis fish external
usr time 819.11 millis 78.00 micros 819.03 millis
sys time 34.67 millis 712.00 micros 33.96 millis
time python day_8.py
________________________________________________________
Executed in 55.30 millis fish external
usr time 22.59 millis 54.00 micros 22.54 millis
sys time 25.86 millis 642.00 micros 25.22 millis
It wasn't a ton of work to get running, but I had to comment out some stuff that isn't available. Some notable pain points: I couldn't import code from another file in the same directory, and I couldn't do zip(*my_list) because the asterisk wasn't supported in that way. I would consider revisiting it if I needed a single-file program that has to work on someone else's machine, provided the compilation works as easily as in the examples.
Additionally, how does it compare to Numba, the compiler for Python?
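For reference, the zip(*...) idiom mentioned above is the standard CPython trick for transposing a list of lists:

```python
# zip(*rows) unpacks the sublists as separate arguments to zip,
# effectively transposing rows into columns.
rows = [[1, 2, 3], [4, 5, 6]]
cols = list(zip(*rows))
print(cols)  # [(1, 4), (2, 5), (3, 6)]
```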
Looks like Python's performance in the ML and AI field will only get stronger.
Would it be possible to write performance-sensitive parts of a Python system in Codon and link that to a CPython or PyPy runtime that supports more dynamic features?
I'm not sure what the terms are in this particular case, but in general, wanting someone to pay if they're deploying it to lots of customers seems reasonable.
On the one hand, any sort of fee (even $1) is a big impediment to adoption compared with free, and so is the idea that "if I try this out and like it, I'll be stuck with whatever costs they impose now and in the future." In a fast-moving world, open source is just easier, because if you change your mind you haven't wasted money. In addition, a project can only afford so many costs, and it is hard to know in advance where those will pop up, so you avoid them when you can.
That said, developers need to eat, and it is easy to appreciate that they let you see the source and play with it, but ask you to pay if you want to use it. I also fully support their right to license their software as they see fit.
Wrong, Redis has a BSD-3 license: https://github.com/redis/redis
Optional add-ons to Redis may have non-free licenses.
I agree that the devs of these tools need to be paid, but that particular avenue presents some roadblocks.
By the way, Redis is still BSD licensed.
Sort of a GraalVM for Python?
A quick search leads to pywasm and it is even native python. But is it usable? Any other options?
MyPyC requires type annotations to work. This does not.
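Roughly the difference: mypyc leans on explicit annotations to generate fast code, while Codon infers the types itself. A hedged sketch of the annotated style (my example, not taken from mypyc's docs):

```python
# The kind of fully annotated function mypyc relies on; under Codon the
# same function, minus the annotations, would be type-inferred instead.
def total(xs: list[int]) -> int:
    s: int = 0
    for x in xs:
        s += x
    return s

print(total([1, 2, 3]))  # 6
```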
Nice! A super-fast LLVM compiler for Python! Well done!
You know, if Python is one of the world's most popular languages, and it was originally implemented as a dynamic and interpreted language (but fast compilers can be written for it, as evinced by Codon!) -- then maybe it would make sense to take languages that were implemented as compilers -- and re-implement them as dynamic interpreted languages!
Oh sure -- that would slow them down by 10x to 100x!
But, even though that would be the case -- the dynamic interpreted versions of the previous compiled-only language -- might be a whole lot more beginner friendly!
In other words, typically in dynamic interpreted languages -- a beginner can use a REPL loop or other device -- to make realtime changes to a program as it is running -- something that is usually impossible with a compiled language...
The possibilities for easy logging, debugging, and introspection of a program -- are typically greater/easier -- in interpreted dynamic languages...
Oh sure, someone can do all of those things in compiled languages too -- but typically the additional set-up to accomplish them is more involved and nuanced -- beginners typically can't do those things easily!
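The kind of live change being described is trivial in an interpreter: redefine a function mid-run, as you would at a REPL, and every subsequent call picks up the new version with no rebuild step. A minimal sketch:

```python
# Redefine a function while the program is running; later calls
# pick up the new definition with no compile/link cycle.
def greet():
    return "hello"

print(greet())  # hello

def greet():    # live redefinition, as one might type at a REPL
    return "bonjour"

print(greet())  # bonjour
```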
So, I think when I think about programming languages from this point forward -- I'm going to think about them as having "two halves":
One half which is a compiled version.
And another half -- which is a dynamic interpreted version...
Usually when a new programming language is created in the world, it is created either as a compiled language or as a dynamic interpreted language -- but never both at the same time!
Usually it takes the work of a third party to port a given language from one domain to the other -- most often from dynamic interpreted to compiled, but sometimes (as with scripting languages derived from compiled languages) in the reverse!
Point is: There are benefits to be derived from each paradigm, both dynamic interpreted and compiled!
So why do we currently look at/think about -- most computer languages -- as either one or the other?
I'm going to be looking at all computer languages as potentially both, from this point forward...
(Related: "Stop Writing Dead Programs" by Jack Rusher (Strange Loop 2022): https://www.youtube.com/watch?v=8Ab3ArE8W3s&t=1383s)