While I think there are tons of optimizations to be done for Python (looking at you, GIL), giving access to low-level CPU primitives is not one that I think will be broadly adopted by the Python community. That's one of the joys of Python: system-agnostic coding that looks pretty close to pseudocode. If you want speed, glue together a bunch of compiled code calls, and hope the call overhead isn't too large. Or write CPU-intensive operations in Numba or Pyrex. At the end of the day, Mojo's pay-to-play programming language harkens back to the early-'90s Borland days.
Right. However, this is a comparison versus Python and the GIL, which can’t do that at all.
> While I think there are tons of optimizations to be done for Python (looking at you, GIL), giving access to low-level CPU primitives is not one that I think will be broadly adopted by the Python community.
It doesn’t need to be, any more than writing Numba or Pyrex is done on a large scale.
> That's one of the joys of Python: system-agnostic coding that looks pretty close to pseudocode. If you want speed, glue together a bunch of compiled code calls, and hope the call overhead isn't too large. Or write CPU-intensive operations in Numba or Pyrex. At the end of the day, Mojo's pay-to-play programming language harkens back to the early-'90s Borland days.
The appeal is having a high-level language that compiles to efficient machine (and GPU!) code. One can “drop down” to Python for the parts that aren't performance-intensive.
I think this will be much more of a draw for people coming from C++, Fortran and other older, jankier languages. It looks to hit a sweet spot for real time embedded development VERY well, especially given Rust-like memory safety!
Mojo will also be a worthy competitor to Julia in the HPC scientific arena I think…we’ll see!
I feel like JAX has been eating Julia’s lunch lately, making me think that there’s a real market for a small functional differentiable programming language with good Python interop - like a more polished Dex or Futhark.
Single-process Python does not take advantage of a multicore architecture, but neither would single-process Mojo. Embarrassingly parallel operations like Mandelbrot can trivially be written with multiprocessing (https://github.com/DipanshuSehjal/Mandelbrot-set/blob/master...) or joblib to run in parallel in otherwise vanilla Python. It would be trivial to implement this in JAX and run it on a GPU or TPU, but I wouldn't say that JAX is the reason for the speedup.
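For anyone who wants to see what "trivially parallel" means here, a minimal sketch using only the standard library. This is not the code from the linked repo; the names (`escape_count`, `render_row`, `render_parallel`), the grid size, and the viewport are all made up for illustration. Each row is independent, so `Pool.map` can hand rows out to worker processes without any shared state.

```python
from multiprocessing import Pool

# Illustrative values only: a small grid over [-2, 1] x [-1, 1].
WIDTH, HEIGHT, MAX_ITER = 64, 48, 100


def escape_count(c, max_iter=MAX_ITER):
    """Number of iterations of z -> z*z + c before |z| exceeds 2."""
    z = 0j
    for n in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return n
    return max_iter  # never escaped: treated as inside the set


def render_row(y):
    """Compute one image row; rows do not depend on each other."""
    im = -1.0 + 2.0 * y / (HEIGHT - 1)
    return [
        escape_count(complex(-2.0 + 3.0 * x / (WIDTH - 1), im))
        for x in range(WIDTH)
    ]


def render_parallel(processes=2):
    """Distribute rows across worker processes (one GIL per process)."""
    with Pool(processes=processes) as pool:
        return pool.map(render_row, range(HEIGHT))


if __name__ == "__main__":
    image = render_parallel()
    print(len(image), len(image[0]))
```

The per-pixel loop is still plain interpreted Python, so this only buys you roughly a factor of the core count, minus process startup and pickling overhead.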
I didn’t address this in my other post. Modular is about to release a freely available SDK. Also, the standard library sources will be open sourced shortly. There are hints of additional open source initiatives.
Modular’s main business plan appears to be adding value in the general area of AI, AI training, and AI deployment, including by offering SaaS. That plan in no way conflicts with (and in fact encourages) an open Mojo language ecosystem.
In Java land we had a bunch of other JVMs over the years offering better performance. Most important things got absorbed into what is now OpenJDK, and the other JVMs, if they even exist at all, are niche players.
Performance is a huge focus in Python and ML lands right now, so why would this be any different?
I guess it's possible that these features will be introduced into CPython, etc., but I doubt it.
I have the impression they hope vendors of AI acceleration hardware, clusters and cloud services will be their customers, so that those vendors can offer their own customers uniform and heavily backward-compatible cross-accelerator AI/ML APIs.
And they hope that users of those services and hardware will also pay for high-quality, well-researched APIs that work reliably with many different AI/ML accelerators, even if Mojo is free. Similar to how Red Hat provides value through commercial-grade QA and sustained development for Linux on high-end hardware, which would otherwise be complicated and risky to use.
Ultimately, promoting the possibility of better performance, and the current contrast with it, is good for prodding other languages/runtimes like Python to match these options. The "important things [get] absorbed" process you mention relies on teams making some "play for" alternatives, to create the impetus to get new things integrated.
> The above code produced a 90x speedup over Python and a 15x speedup over NumPy as shown in the figure below:
Am I missing something?
I’ll take it.
This is all pretty impressive if I can take my unmodified (slightly modified?) Python code and get that sort of improvement.
it'll never work as smoothly as they advertise. just hands down, beyond a shadow of a doubt, their claims about supporting "unmodified" Python code are startup hype. how do i know? i could give you a bunch of technical reasons about Python as a language and CPython as the de facto implementation (thereby informing tons of code already written, re extensions) but there's a much simpler way to reason about it: because there are already >10 attempts at this and no one has been able to do it. there's no magic here that any number of dollars or brains could pull off. instead each such project picks a point on the pythonic<->performant design-space tradeoff curve and then asks/expects you to live with that choice.
and taking ^ into consideration, mojo is not that special. only thing going for it is chris lattner isn't bad at designing languages so maybe, on its own, it'll be a nice language (but it needs to be open to get any traction on its own).
Well, isn't that most Python? If Mojo can pave over the slow interpreted bits I repeatedly dig up in Python profilers, even in well-maintained projects, with no code changes, that would be huge.
Also, Swift isn’t very interesting outside the Apple ecosystem, and Metal doesn’t exist outside the Apple ecosystem. Mojo has a real shot at widespread adoption as a general-purpose language!
Furthermore, while I love Julia the language, I'm disappointed in how it really hasn't taken off in adoption by either academia or industry. The community is small and that becomes a real pain point when it comes to tooling. Using the debugger is an awful experience and the VSCode extension that is the recommended way to write Julia is very hit-or-miss. I think it would really benefit from a lot more funding, which doesn't actually seem to be coming. It's not a 1-to-1 comparison, but Modular has received three times as much funding as JuliaHub despite being much younger.
For the time being, my chips are still on the Julia horse.
More on GPE if you're curious: https://llvm.org/devmtg/2018-10/slides/Hong-Lattner-SwiftFor...
…BUT…
For my personal tastes, Mojo’s Rust-like memory safety, its lack of garbage collection, and its attention to ahead-of-time compilation put it way ahead. The vast pool of Python developers who can easily pick it up if interested is a big plus.
Julia is aimed at a somewhat different space, but there’s also a huge overlap.
Let’s hope for good interoperability between the two, it seems fairly straightforward…
Other languages have failed for less visible reasons.
Actually more interested in things like UIs, quick API servers, stuff like that than the AI/ML use cases. The idea of most of the ease and approachability of Python, a proper type system, and access to the entire ecosystem of Python libs in a compiled language is pretty compelling.
So for a Python programmer with a performance problem, it doesn't look like a solution.
How would you differentiate Mojo code from vanilla Python without a ton of boilerplate at language boundaries?
I think I should be impressed, but I feel like I’m missing the point.
Throw a half dozen engineers at it, develop a deployment plan for SD XL, profit.
You'll get a ton of open source developers working on improving the Mojo versions even further once you release it, researchers developing extensions, etc. GO TO WHERE THE DEVELOPERS ARE.
Stable Diffusion is crazy compute heavy, so if Mojo is what it's purported to be, it should be possible to get speedups.
I want a Python that can statically plan underlying GPU allocations, avoids CUDA kernel dispatch overhead and enables a multi-GPU API that isn't some multiprocessing abomination.