Python 3.15's JIT is now back on track

(fidget-spinner.github.io)

483 points · guidoiaquinti · 3mo ago · 310 comments

130 comments · 14 top-level

Python really needs to take the Typescript approach of "all valid Python4 is valid Python3". And then add value types so we can have int64 etc. And allow object refs to be frozen after instantiation to avoid the indirection tax.

Sensible type-annotated python code could be so much faster if it didn't have to assume everything could change at any time. Most things don't change, and if they do they change on startup (e.g. ORM bindings).

mattclarkdotnet · 3mo ago

To clarify, it is nuts that in an object method, there is a performance enhancement through caching a member value.

  class SomeClass:
    def __init__(self):
      self.x = 0
    def SomeMethod(self):
      q = self.x
      # do stuff with q, because otherwise you're dereferencing self.x all the damn time

1718627440 · 3mo ago

This is not just a performance concern, this describes completely different behaviour. You forgot that self.x goes through type(self).__getattribute__(self, 'x'), falling back to __getattr__, and that you can implement those however you like. There is no guarantee of object identity across the values they return.
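A minimal sketch of why two reads of an attribute need not yield the same object. The class `Volatile` and its `reads` counter are hypothetical names made up for illustration; the only language feature relied on is that `__getattr__` is consulted when normal lookup fails, which is always the case here because `x` is never stored on the instance.

```python
class Volatile:
    """Hypothetical class whose attribute `x` is computed on every access."""

    def __init__(self):
        self.reads = 0

    def __getattr__(self, name):
        # Called only when normal lookup fails; `x` is never stored,
        # so every `obj.x` ends up here and runs arbitrary code.
        if name == "x":
            self.reads += 1
            return [self.reads]  # a fresh list each time
        raise AttributeError(name)


obj = Volatile()
first = obj.x
second = obj.x
print(first is second)  # False: two reads, two different objects
print(obj.reads)        # 2: caching `q = obj.x` would have skipped a call
```

Caching the value in a local therefore changes how often user code runs, which is why it is a semantic choice and not only a speed-up.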

dekelpilli · 3mo ago

Java also has a performance cost to accessing class fields, as exemplified by this (now-replaced) code in the JDK itself - https://github.com/openjdk/jdk/blob/jdk8-b120/jdk/src/share/...

mathisfun123 · 3mo ago

> it is nuts that in an object method, there is a performance enhancement through caching a member value

i don't understand what you think is nuts about this. it's an interpreted language and the word `self` is not special in any way (it's just convention - you can call the first param to a method anything you want). so there's no way for the interpreter/compiler/runtime to know you're accessing a field of the class itself (let alone that that field isn't a computed property or something like that).

lots of hottakes that people have (like this one) are rooted in just a fundamental misunderstanding of the language and programming languages in general <shrugs>.

duskdozer · 3mo ago

You mean even if x is not a property?

stabbles · 3mo ago

That was how the Mojo language started. And then soon after the hype they said that being a superset of Python was no longer the goal. Probably because being a superset of Python is not a guarantee for performance either.

Hendrikto · 3mo ago

Being a superset would mean all valid Python 3 is valid Python 4. A valuable property for sure, but not what OP suggested. In fact, it is the exact opposite.

bloppe · 3mo ago

But that's just not what python is for. Move your performance-critical logic into a native module.

mattclarkdotnet · 3mo ago

Performance is one part of the discussion, but cleanliness is another. A Python4 that actually used typing in the interpreter, had value types, had a comptime phase to allow most metaprogramming to work (like monkey patching for tests) would be great! It would be faster, cleaner, easier to reason about, and still retain the great syntax and flexibility of the language.

wiseowise · 3mo ago

I’d be happy if, overnight, all the Python code in the world could reap 10-100x performance benefits without changing much of the codebase; you can continue having a soup of multiple languages.

fermigier · 3mo ago

I have made some experiments with P2W, my experimental Python (subset) to WASM compiler. Initial figures are encouraging (5x speedup, on specific programs).

https://github.com/abilian/p2w

NB: some preliminary results:

  p2w is 4.03x SLOWER than gcc (geometric mean)

  p2w is 5.50x FASTER than cpython (geometric mean)

  p2w is 1.24x FASTER than pypy (geometric mean)

BerislavLopac · 3mo ago

> python code could be so much faster if it didn't have to assume everything could change at any time

Definitely, but then it wouldn't be Python. One of the core principles of Python's design is to be extremely dynamic, and that anything can change at any time.

There are many other, pretty good, strictly dynamically typed languages which work just as well if not better than Python, for many purposes.

oblio · 3mo ago

I feel that this excuse is being trotted out too much. Most engineers never get to choose the programming language used for 90% of their professional projects.

And when Python is a mainstream language on top of which large, globally known websites, AI tools, core system utilities, etc are built, we should give up the purity angle and be practical.

Even the new performance push in Python land is a reflection of this. A long time ago some optimizations were refused in order to not complicate the default Python implementation.

wolvesechoes · 3mo ago

> Python really needs to take the Typescript approach of "all valid Python4 is valid Python3"

It is called type hints, and is already there. TS typing doesn't bring any perf benefits over plain JS.

stabbles · 3mo ago

You really need dedicated types for `int64` and something like `final`. Consider:

    class Foo:
      __slots__ = ("a", "b")
      a: int
      b: float

there are multiple issues with Python that prevent optimizations:

* a user can define subtype `class my_int(int)`, so you cannot optimize the layout of `class Foo`

* the builtin `int` and `float` are big-int like numbers, so operations on them are branchy and allocating.

and the fact that Foo is mutable and that `id(foo.a)` has to produce something complicates things further.
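To illustrate the points above, here is a small sketch showing that annotations on a `__slots__` class are not enforced at runtime, so a slot annotated `int` can hold a subclass instance or an arbitrarily large integer. `Foo` and `my_int` follow the comment; the overridden `__add__` is an assumption added purely for demonstration.

```python
class Foo:
    __slots__ = ("a", "b")
    a: int
    b: float


class my_int(int):
    # A subclass may override arithmetic, so `foo.a + 1` cannot be
    # compiled down to a plain machine add without a type check.
    def __add__(self, other):
        return "surprise"


foo = Foo()
foo.a = my_int(1)          # accepted: annotations are not checked at runtime
print(foo.a + 1)           # surprise
foo.a = 2 ** 100           # also accepted: int is arbitrary precision
print(foo.a.bit_length())  # 101
foo.b = "not a float"      # still accepted
```

An optimizer that wanted to store `a` as a raw 64-bit value would need a guarantee that none of these assignments can happen, which is what a dedicated `int64` or `final` would provide.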

phkahler · 3mo ago

>> Sensible type-annotated python code could be so much faster if it didn't have to assume everything could change at any time.

Then it wouldn't be Python any more.

0xffff2 · 3mo ago

Fine by me. I don't particularly like Python, but it's the de facto standard in my field so I have to use it (admittedly this is an improvement over a decade ago, when MATLAB was the de facto standard). I don't care about preserving the spirit of Python, I just care that the thing that bears the name Python meets my needs.

germandiago · 3mo ago

I share your view. Python's flexibility is central to Python.

Even type annotations, though useful, can get in the way for certain tasks. Betting on things like these to speed things up would be a mistake, since it would kind of force you to follow that style.

Anything that accelerates things should rely on run-time data, not on type annotations that won't change.

panzi · 3mo ago

Isn't RPython doing that, allowing changes on startup and then it's basically statically typed? Does it still exist? Was it ever production ready? I only once read a paper about it decades ago.

zahlman · 3mo ago

It exists in the sense that PyPy exists.

As far as I can tell, it only ever existed to make PyPy possible, and was only defined/specified in terms of PyPy's needs.

mattclarkdotnet · 3mo ago

RPython is great, but it changes semantics in all sorts of ways. No sets for example. WTF? The native Set type is one of the best features of Python. Tuples also get mangled in RPython.

rich_sasha · 3mo ago

I think sadly a lot of Python in the wild relies heavily, somewhere, on the crazy unoptimisable stuff. For example pytest monkey patches everything everywhere all the time.

You could make this clean break and call it Python 4 but frankly I fear it won't be Python anymore.
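As a rough illustration of the kind of runtime monkey patching mentioned above, the sketch below swaps out a standard-library function for the duration of a block. It uses `unittest.mock.patch` from the standard library, not pytest's own machinery, and the helper `current_dir_name` and the fake path are made up for the example.

```python
import os
from unittest import mock


def current_dir_name():
    # Looks up `os.getcwd` at call time, so it sees whatever is bound then.
    return os.path.basename(os.getcwd())


real = current_dir_name()
with mock.patch("os.getcwd", return_value="/tmp/fake_project"):
    patched = current_dir_name()

print(patched)                     # fake_project
print(current_dir_name() == real)  # True: the patch is undone on exit
```

Because the replacement happens while the program is running, a compiler cannot treat `os.getcwd` as a fixed target without some way to detect and undo that assumption.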

fyrn_ · 3mo ago

As a person who has spent a lot of time with pytest, I'm ready for a testing framework that doesn't do any of that non-obvious stuff. I generally use unittest as much as I can these days; it's so much less _weird_ about how it does things. Like jeez pytest, do you _really_ need to stress test every obscure language feature? Your job is to call tests.

germandiago · 3mo ago

If you do that you then have a less productive language for many use cases IMHO.

All the dynamism from Python should stay where it is.

Just JIT and remember a type maybe, but do not force a type from a type hint or such things.

As a minimum, I would say not relying on that is the correct thing. You could exploit it, but not force it to change the semantics.

mattclarkdotnet · 3mo ago

Allowing metaprogramming at module import (or another defined phase) would cover most monkey patching use cases. A hypothetical `from __future__ import python4` would allow developers to declare their code optimisable.

NetMageSCW · 3mo ago

Perl 6 showed what happens when you do something like that.

dobremeno · 3mo ago

SPy [1] is a new attempt at something like this.

TL;DR: SPy is a variant of Python specifically designed to be statically compilable while retaining a lot of the "useful" dynamic parts of Python.

The effort is led by Antonio Cuni, Principal Software Engineer at Anaconda. Still very early days but it seems promising to me.

[1] https://github.com/spylang/spy

mattclarkdotnet · 3mo ago

Thank you! SPy looks brilliant, especially the comptime-like freezing after import.

musicale · 3mo ago

> Python really needs to take the Typescript approach of "all valid Python4 is valid Python3"

Great idea, but I'm not convinced that they learned anything from the Python 2 to 3 transition, so I wouldn't hold my breath.

If you want a language system without contempt for backward compatibility, you're probably better off with Java/C++/JavaScript/etc. (though using JS libraries is like building on quicksand.) Bit of a shame since I want to like Python/Rust/Swift/other modern-ish languages, but it turns out that formal language specifications were actually a pretty good idea. API stability is another.

musicale · 3mo ago

is that you, python core dev team? ;-)

BiteCode_dev · 3mo ago

There will be no Python 4, and the 3.x policy requires forward compat, so we are already there.

mattclarkdotnet · 3mo ago

Oh, and while we're at it, fix the "empty array is instantiated at parse time so all your functions with a default empty array argument share the same object" bullshit.

zahlman · 3mo ago

We don't call them "arrays".

It has nothing to do with whether the list is empty. It has nothing to do with lists at all. It's the behaviour of default arguments.

It happens at the time that the function object is created, which is during runtime.

You only notice because lists are mutable. You should already prefer not to mutate parameters, and it especially doesn't make sense to mutate a parameter that has a default value because the point of mutating parameters is that the change can be seen by the caller, but a caller that uses a default value can't see the default value.

The behaviour can be used intentionally. (I would argue that it's overused intentionally; people use it to "bind" loop variables to lambdas when they should be using `functools.partial`.)

If you're getting got by this, you're fundamentally expecting Python to work in a way that Pythonistas consider not to make sense.
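The loop-variable case mentioned above can be sketched as follows. It compares a plain closure, the default-argument trick, and `functools.partial`; the helper `identity` is a made-up name for the example.

```python
from functools import partial

# Late binding: every lambda looks up `i` when called, after the loop ended.
late = [lambda: i for i in range(3)]

# Default-argument trick: `i=i` is evaluated when each lambda is created.
bound_default = [lambda i=i: i for i in range(3)]


# functools.partial: the value is stored explicitly in the partial object.
def identity(value):
    return value


bound_partial = [partial(identity, i) for i in range(3)]

print([f() for f in late])           # [2, 2, 2]
print([f() for f in bound_default])  # [0, 1, 2]
print([f() for f in bound_partial])  # [0, 1, 2]
```

Both bound versions give the expected result; `partial` simply makes the captured value explicit instead of hiding it in a parameter the caller could accidentally override.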

Izkata · 3mo ago

Execution time, not parse time. It's a side effect of function declarations being statements that are executed, not the list/dict itself. It would happen with any object.
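A small sketch of the "execution time" point: the default is created each time the `def` statement runs, so re-executing the `def` yields a new default object. The function names `make_append` and `append` are made up for the example.

```python
def make_append():
    # The `def` statement below runs each time make_append() is called,
    # and the default list is created at that moment, not at parse time.
    def append(item, bucket=[]):
        bucket.append(item)
        return bucket
    return append


first = make_append()
second = make_append()

print(first(1))   # [1]
print(first(2))   # [1, 2]  same function object, same default list
print(second(3))  # [3]     new function object, new default list
print(first.__defaults__[0] is second.__defaults__[0])  # False
```

The shared state belongs to the function object (it is visible in `__defaults__`), which is why any mutable default shows this behaviour, not only lists.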

exyi · 3mo ago

If you change this you break a common optimization:

https://github.com/python/cpython/blob/3.14/Lib/json/encoder...

Default value is evaluated once, and accessing parameter is much cheaper than global
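A hedged sketch of the pattern being described, not the code from the linked file: a global is bound to a parameter default so the loop body reads it as a local. The function names and the `_sqrt` parameter are made up for the example, and no timings are quoted because they depend on the machine and Python version.

```python
import math
import timeit


def total_global(values):
    total = 0.0
    for v in values:
        total += math.sqrt(v)  # global lookup of `math`, then attribute lookup
    return total


def total_default(values, _sqrt=math.sqrt):
    total = 0.0
    for v in values:
        total += _sqrt(v)      # `_sqrt` was bound once and is read as a local
    return total


data = list(range(1000))
print(total_global(data) == total_default(data))  # True

# Compare the two on your own interpreter.
print(timeit.timeit(lambda: total_global(data), number=200))
print(timeit.timeit(lambda: total_default(data), number=200))
```

If defaults were re-evaluated on every call, `_sqrt=math.sqrt` would perform the global and attribute lookups each time and the trick would lose its purpose.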

zeratax · 3mo ago

there is PEP 671 for that, which introduces extra syntax for the behavior you want. people rely on the current behavior so you can't really change it

giancarlostoro · 3mo ago

I went sort of this route in an experiment with Claude. I really want Python for .NET, but I said, damn the expense, prioritize .NET compatibility and remove anything that isn't feasibly supported. It means 0 Python libs, but all of NuGet is supported. The rules are that all signatures need types, and if you declare a type, it is that type, no exceptions, just like in C# (if you squint when looking at var in a funny way). I wound up with reasonable results, just a huge trade of the entire Python ecosystem for .NET with an insanely Python-esque syntax.

Still churning on it, will probably publish it and do a proper blog post once I've built something interesting with the language itself.

coredog64 · 3mo ago

IronPython -> TitaniumPython?

ecshafer · 3mo ago · 16 in thread

What is wrong with the Python code base that makes this so much harder to implement than seemingly all other code bases? Ruby, PHP, JS. They all seemed to add JITs in significantly less time. A Python JIT has been asked for for like 2 decades at this point.

0cf8612b2e1e · 3mo ago

The Python C API leaks its guts. Too much of the internal representation was made available for extensions and now basically any change would be guaranteed to break backwards compatibility with something.

echelon · 3mo ago

It's a shame that Python 2->3 transition was so painful, because Python could use a few more clean breaks with the past.

This would be a potential case for a new major version number.

patmorgan23 · 3mo ago

Ooo, this makes sense. It's like if Linux had "don't break user space" AND a whole bunch of other purely internal APIs you also can't refactor.

hardwaregeek · 3mo ago

For what it’s worth Ruby’s JIT took several different implementations, definitely struggled with Rails compatibility and literally used some people’s PhD research. It wasn’t a trivial affair

stmw · 3mo ago

Some languages are much harder to compile well to machine code. Some big factors (for any languages) are things like: lack of static types and high "type uncertainty", other dynamic language features, established inefficient extension interfaces that have to be maintained, unusual threading models...

RussianCow · 3mo ago

That makes sense if you're comparing with Java or C#, but not Ruby, which is way more dynamic than Python.

The more likely reason is that there simply hasn't been that big a push for it. Ruby was dog slow before the JIT and Rails was very popular, so there was a lot of demand and room for improvement. PHP was the primary language used by Facebook for a long time, and they had deep pockets. JS powers the web, so there's a huge incentive for companies like Google to make it faster. Python never really had that same level of investment, at least from a performance standpoint.

To your point, though, the C API has made certain types of optimizations extremely difficult, as the PyPy team has figured out.

simonask · 3mo ago

The simplest JIT just generates the machine code instructions that the interpreter loop would execute anyway. It’s not an extremely difficult thing, but it also doesn’t give you much benefit.

A worthwhile JIT is a fully optimizing compiler, and that is the hard part. Language semantics are much less important - dynamic languages aren’t particularly harder here, but the performance roof is obviously just much lower.

kelvinjps · 3mo ago

I think it's just that Python people approached the problem differently: they made working with C and other languages better, made bindings for Python, and offloaded the performance-critical code to these libraries. Ex: NumPy

fleetfox · 3mo ago

I can't really talk about Ruby. But PHP is much more static, the surface of things you have to care about at runtime is like an order of magnitude smaller, and there already was OPcache as a starting point. And speaking of JS, the JIT in V8 is one of the most sophisticated and complicated ever built. There haven't been nearly enough man-hours and funding for CPython to make it a fair comparison.

fridder · 3mo ago

For better or for worse, they have been very consistent throughout the years that they don't want to degrade existing performance. It is why the GIL existed for so long.

bawolff · 3mo ago

I thought PHP hadn't shipped its JIT yet (as in, it's behind a disabled-by-default config)

SahAssar · 3mo ago

PHP 8 shipped with JIT on by default unless I'm mistaken.

wat10000 · 3mo ago

PHP and JS had huge tech companies pouring resources into making them fast.

brokencode · 3mo ago

Are you forgetting about PyPy, which has existed for almost 2 decades at this point?

RussianCow · 3mo ago

That's a completely separate codebase that purposefully breaks backwards compatibility in specific areas to achieve their goals. That's not the same as having a first-class JIT in CPython, the actual Python implementation that ~everyone uses.

g947o · 3mo ago

Money.

owaislone · 3mo ago · 13 in thread

Oh man, Python 2 -> 3 was such a massive shift. It took almost half a decade, if not more, and yet it was mainly changing superficial syntax stuff. They should have allowed ABIs to break and gotten these internal things done, and probably come up with a new, tighter API for integrating with other lower-level languages, so that going forward Python internals could be changed more freely without breaking everything.

scorpioxy · 3mo ago

The text encoding stuff wasn't a small change considering what it could break, at least. And remember we're sometimes talking about software that would cost a lot of money to migrate or upgrade. I still maintain some 2.x python code-bases that will be very expensive to migrate and the customer is not willing to invest that money.

Although your general sentiment is something I agree with (if it's going to be painful, do it and get it over with), I don't believe anybody knew or could've guessed what the reaction of the ecosystem would be.

Your last point about being able to change internals more freely is also great in theory but very difficult (if not impossible) to achieve in practice.

I don't know. Having maintained some small projects that were free and open source, I saw the hostility and entitlement that can come from that position. And those projects were a speck of dust next to something like Python. So I think the core team is doing the best they can. It was always going to be damned if you do, damned if you don't.

eru · 3mo ago

> I still maintain some 2.x python code-bases that will be very expensive to migrate and the customer is not willing to invest that money.

Slight tangent: if Claude can decimate IBM stock price by migrating off Cobol for cheap, surely we can do Python 2 to 3 now, too?

About the internals: we sort of missed an opportunity there, but back then they also didn't quite know what they were doing (or at least we have better ideas of what's useful today). And making the step from 2 to 3 even bigger might have been a bad idea?

smcl · 3mo ago

I cannot believe people are still acting like Python 2->3 was a huge fuck-up and an enormous missed opportunity. When in reality Python is by most measures the most popular language and became so AFTER that switch.

Since the switch we have seen enormous companies being built from scratch. There is no reason for anyone to be complaining about it being too hard to upgrade in 2026

rtpg · 3mo ago

Living through it... Python 3 made a lot of changes for the better but 3.0 in particular included a bunch of unforced errors that made it too hard for people to upgrade in one go.

It wasn't until much later (I would say 3.4 or 3.5?) that we had good tooling to allow for migrating from Python 2 to Python 3 gradually, which is what most tools needed to do.

The final thing that made Python upgrading easy was making a bunch of changes (along with stuff like six) so that you could write code that would run identically in Python 2 and Python 3. That lets you do refactors over time, little cleanups, and not have the huge "move to Python 3" commit.

badsectoracula · 3mo ago

> Python is by most measures the most popular language and became so AFTER that switch

The switch had nothing to do with Python's rise in popularity though; it was because of NumPy and later PyTorch being adopted by data scientists and later for machine learning tasks, which themselves became very popular. Python's popularity rose alongside those.

> There is no reason for anyone to be complaining about it being too hard to upgrade in 2026

The "complaints" are about unnecessary and pointless breakage that made it very difficult for many codebases to upgrade for years. That by now most of these codebases have been either abandoned, upgraded or decided to stick with Python 2 until the end of time doesn't mean these pains didn't happen, nor that the language's developers inflicting them on their users was a good idea because some largely unrelated external factors made the language popular several years later.

20k · 3mo ago

It took a long time for python 3 to add the necessary backwards compatibility features to allow people to switch over. Once they did it was fine, but it was a massive fuck up until then. The migration took far longer than it should have done

It's widely regarded as a disaster for good reason, and it forced some corrections in Python to fix it. Just because it's fine now does not mean it was always fine.

bmitc · 3mo ago

Those are unrelated.

nurettin · 3mo ago

The biggest (and worst planned) change was module names. Your imports didn't work, forcing hacks like

    import sys

    if sys.version_info.major == 2:
        import old
    else:
        import new

Or worse, people used try/except in their imports.
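For readers who have not seen it, the try/except variant typically looked like the sketch below. It uses the `ConfigParser` to `configparser` rename as the example, which is one of the modules renamed in Python 3; the choice of module is an assumption for illustration, not something named in the comment.

```python
try:
    import configparser                  # Python 3 name
except ImportError:
    import ConfigParser as configparser  # Python 2 name

# On Python 3 the first branch succeeds and the fallback never runs.
print(configparser.__name__)
```

The downside is that an unrelated `ImportError` raised inside the first module is silently swallowed and replaced by the fallback.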

jmspring · 3mo ago

still GIL

marcyb5st · 3mo ago

Opt-in starting from 3.15, or am I mistaken?

Anyway, you can already try free-threaded builds that have the GIL disabled, but my experience is that most of your dependencies won't work.

gjvc · 3mo ago

yes. it was not a massive shift. it was barely worth the effort.

pansa2 · 3mo ago

The Python devs didn’t want to make huge changes because they were worried Python 3 would end up taking forever like Perl 6. Instead they went to the other extreme and broke everyone’s code for trivial reasons and minimal benefit, which meant no-one wanted to upgrade.

Even the main driver for Python 3, the bytes-Unicode split, has unfortunately turned out to be sub-optimal. Python essentially bet on UTF-32 (with space-saving optimisations), while everyone else has chosen UTF-8.

gjvc · 3mo ago

this must be right, i'm getting downvoted

adrian17 · 3mo ago · 11 in thread

I've been occasionally glancing at the PR/issue tracker to keep up to date with things happening with the JIT, but I've never seen where the high level discussions were happening; the issues and PRs always jumped right to the gritty details. Is there a high-level introduction/example anywhere of how trace projection and trace recording work and differ? Googling for the terms often returns the CPython issue tracker as the first result, and the repo's jit.md is relatively barebones and rarely updated :(

Similarly, I don't entirely understand refcount elimination; I've seen the codegen difference, but since the codegen happens at build time, does this mean each opcode is possibly split into two (or more?) stencils, with and without removed increfs/decrefs? With so many opcodes and their specialized variants, how many stencils are there now?

VagabundoP · 2mo ago

Occasionally Core.py will do some updates, higher level stuff:

https://open.spotify.com/show/1PGRfdrLEwgXjQbPBNk1pW

Pablo and Łukasz

kenjin4096 · 3mo ago

> I've never seen where the high level discussions were happening

Thanks for your interest. This is something we could improve on. We were supposed to document the JIT better in 3.15, but right now we're crunching for the 3.15 release. I'll try to get to updating the docs soon if there's enough interest. PEP 744 does not document the new frontend.

I wrote a somewhat high-level overview here in a previous blog post https://fidget-spinner.github.io/posts/faster-jit-plan.html#...

> does this mean each opcode is possibly split into two (or more?) stencils, with and without removed increfs/decrefs?

This is a great question, and the answer is: not exactly! The key is to expose the refcount ops in the intermediate representation (IR) as one single op. For example, BINARY_OP becomes BINARY_OP, POP_TOP (DECREF), POP_TOP (DECREF). That way, instead of optimizing for n operations, we just need to expose refcounting of n operations and optimize only 1 op (POP_TOP). Thus, we just need to refactor the IR to expose refcounting (which was the work I divided up among the community).

If you have any more questions, I'm happy to answer them either in public or email.
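The idea in the answer above can be sketched with a toy model. This is not CPython's real IR, data structures, or optimizer: the tuple representation, the `expand` and `eliminate` functions, and the `known_alive` set are all assumptions invented for illustration. Only the op names BINARY_OP, POP_TOP and POP_TOP_NOP come from the thread.

```python
# Toy model only: not CPython's real IR, data structures, or optimizer.

def expand(ops):
    """Expose refcounting: BINARY_OP -> BINARY_OP + one POP_TOP per input."""
    out = []
    for name, inputs in ops:
        if name == "BINARY_OP":
            out.append(("BINARY_OP", inputs))
            out.extend(("POP_TOP", (operand,)) for operand in inputs)
        else:
            out.append((name, inputs))
    return out


def eliminate(ops, known_alive):
    """Turn POP_TOP into POP_TOP_NOP when the operand provably stays alive."""
    return [
        ("POP_TOP_NOP", inputs)
        if name == "POP_TOP" and inputs[0] in known_alive
        else (name, inputs)
        for name, inputs in ops
    ]


trace = [("BINARY_OP", ("a", "tmp"))]
expanded = expand(trace)
optimized = eliminate(expanded, known_alive={"a"})  # `a` is a live local
for op in optimized:
    print(op)
```

The point of the split is visible in `eliminate`: it contains a single rule for POP_TOP, and every op that was expanded benefits from it without needing its own variant.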

adrian17 · 3mo ago

I saw your documentation PR, thank you!

I also did some reading and experiments, so quickly talking about things I've found out re: refcount elimination:

Previously given an expression `c = a + b`, the compiler generated a sequence of two LOADs (that increment the inputs' refcounts), then BINARY_OP that adds the inputs and decrements the refcounts afterwards (possibly deallocating the inputs).

But if the optimizer can prove that the inputs definitely will have existing references after the addition finishes (like when `a` and `b` are local variables, or if they are immortals like `a+5`), then the entire incref/decref pair could be ignored. So in the new version, the DECREFs part of the BINARY_OP was split into separate uops, which are then possibly transformed into POP_TOP_NOP by the optimizer.
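The bytecode for `c = a + b` can be inspected with the standard `dis` module, as sketched below. Exact opcode names differ between CPython versions (for example, the loads may be fused or renamed), so no expected listing is shown; this shows only the tier 1 bytecode, not the JIT's uops.

```python
import dis


def add(a, b):
    c = a + b
    return c


# Print the bytecode CPython compiled for `add`.
dis.dis(add)

# Collect opcode names to check for the binary-operation instruction.
names = [ins.opname for ins in dis.get_instructions(add)]
print(any(name.startswith("BINARY_") for name in names))
```

Here `a` and `b` are locals that stay alive across the addition, which is the situation where the incref/decref pair described above is redundant.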

And I'm assuming that although normally splitting an op this much would usually cost some performance (as the compiler can't optimize them as well anymore), in this case it's usually worth it as the optimization almost always succeeds, and even if it doesn't, the uops are still generated in several variants for various TOS cache (which is basically registers) states so they still often codegen into just 1-2 opcodes on x86.

One thing I don't entirely understand, but that's super specific from my experiment, not sure if it's a bug or special case: I looked at tier2 traces for `for i in lst: (-i) + (-i)`, where `i` is an object of custom int-like class with overloaded methods (to control which optimizations happen). When its __neg__ returns a number, then I see a nice sequence of

_POP_TOP_INT_r32, _r21, _r10.

But when __neg__ returns a new instance of the int-like class, then it emits

_SPILL_OR_RELOAD_r31, _POP_TOP_r10, _SPILL_OR_RELOAD_r01, _POP_TOP_r10, etc.

Is there some specific reason why the "basic" pop is not specialized for TOS cache? Is it because it's the same opcode as in tier1, and it's just not worth it as it's optimized into specialized uops most of the time; or is it that it can't be optimized the same way because of the decref possibly calling user code?

kenjin4096 · 3mo ago

Update: I put up a PR to document the trace recording interpreter https://github.com/python/cpython/pull/146110

flakes · 3mo ago

You’ll probably want to look to the PEPs. I haven't dug into this topic myself, but this looks related: https://peps.python.org/pep-0744/

adrian17 · 3mo ago

I think CPython already had tier2 and some tracing infrastructure when the copy-and-patch JIT backend was added; it's the "JIT frontend" that's more obscure to me.

rtpg · 3mo ago

discussions might be happening on the Python forums, which are pretty active.

https://discuss.python.org/t/pep-744-jit-compilation/50756/8... here's one thing

I do think you can also just outright ask questions about it on the forums and you'll get some answers.

At the end of the day there's only so many people working on this though.

saikia81 · 3mo ago

Have you read the dev mailing list? The Python developers discuss a lot there.

pansa2 · 3mo ago

There isn’t a dev mailing list any more, is there? Do you mean the Discourse forum?

sheepscreek · 3mo ago

UPDATE: I misunderstood the question :-/ You can ignore this.

I love playing with compilers for fun, so maybe I can shed some light. I’ll explain it in a simplified way for everyone’s benefit (going to ignore the stack):

When an object is passed between functions in Python, it doesn’t get copied. Instead, a reference to the object’s memory address is sent. This reference acts as a pointer to the object’s data. Think of it like a sticky note with the object’s memory address written on it. Now, imagine throwing away one sticky note every time a function that used a reference returns.

When an object has zero references, it can be freed from memory and reused. Ensuring the number of references, or the “reference count” is always accurate is therefore a big deal. It is often the source of memory leaks, but I wouldn’t attribute it to a speed up (only if it replaces GC, then yes).

yuliyp · 3mo ago

what at all does this comment have to do with what it's replying to?

rslashuser · 3mo ago · 7 in thread

I'm curious if the JIT developers could mention any Python features that prevent promising JIT features. An earlier Ken Jin blog post [1] mentions how __del__ complicates reference counting optimization.

There is a story that Python is harder to optimize than, say, Typescript, with Python flexibility and the C API getting mentioned. Maybe, if the list of troublesome Python features was out there, programmers could know to avoid those features with the promise of activating the JIT when it can prove the feature is not in use. This could provide a way out of the current Python hard-to-JIT trap. It's just a gist of an idea, but certainly an interesting first step would be to hear from the JIT people which Python features they find troublesome.

[1] https://fidget-spinner.github.io/posts/faster-jit-plan.html

rtpg · 3mo ago

It's interesting you mention __del__, because JavaScript not only doesn't have destructors, but for security reasons (that are above my pay grade) the spec _explicitly prohibits_ implementations from allowing visibility into garbage collection state, meaning that code cannot have any visibility into deallocations.

I think __del__ is tricky though. In theory __del__ is not meant to be reliable. In practice CPython reliably calls it cuz it reference counts. So people know about it and use it (though I've only really seen it used for best effort cleanup checks)

In a world where more people were using PyPy we could have pressure from that perspective to avoid leaning into it. And that would also generate more pressure to implement code that is performant in "any" system.
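A minimal sketch of the CPython behaviour described above, where `__del__` runs as soon as the last reference goes away. The `Resource` class and the `events` list are made up for the example, and the immediate call is a CPython implementation detail; other implementations such as PyPy may run the finalizer later.

```python
events = []


class Resource:
    def __del__(self):
        events.append("finalized")


res = Resource()
alias = res

del res
print(events)  # []: `alias` still holds a reference
del alias
print(events)  # ['finalized'] on CPython: the count hit zero, __del__ ran at once
```

Code that depends on the second `print` showing the finalizer already ran is relying on reference counting, which is exactly the kind of assumption that constrains what an optimizer may do with refcount operations.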

cpgxiii · 3mo ago

> In practice CPython reliably calls it cuz it reference counts ... In a world where more people were using PyPy we could have pressure from that perspective to avoid leaning into it

A big part of the problem is that much of the power of the Python ecosystem comes specifically from extensions/bindings written in languages with manual (C) or RAII/ref-counted (C++, Rust) memory management, and having predictable Python-level cleanup behavior can be pretty necessary to making cleanup behavior in bound C/C++/Rust objects work. Breaking this behavior or causing too much of a performance hit is basically a non-starter for a lot of Python users, even if doing so would improve the performance of "pure" Python programs.

nvme0n1p1 · 3mo ago

> code cannot have any visibility into deallocations

Doesn't FinalizationRegistry let you do exactly that?

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...

jonathanlydall · 3mo ago

> meaning that code cannot have any visibility into deallocations.

This is more pedantry than a serious question. JavaScript has WeakRef; sure, it'd be cumbersome and inefficient because you'd need to manually make and poll each thing you wanted to observe, but could it not be said that it does provide a view on deallocations?

adgjlsfhk1 · 3mo ago

The biggest thing is BigInt by default. It makes every integer operation require an overflow check.
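A short illustration of what "BigInt by default" means in practice: Python integers never wrap around, so an addition that would overflow a 64-bit register has to be detected and promoted to a larger representation. The size comparison at the end relies on CPython's `sys.getsizeof` and is an implementation detail.

```python
import sys

big = 2 ** 63 - 1          # largest signed 64-bit value
print(big + 1)             # 9223372036854775808: no wraparound
print(big + 1 == 2 ** 63)  # True

# Storage grows with magnitude; exact byte counts are CPython-specific.
print(sys.getsizeof(1) < sys.getsizeof(2 ** 200))  # True on CPython
```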

ridiculous_fish · 3mo ago

JS (when using ints, which V8 does) is the same in this respect.

kstrauser · 3mo ago

Huh, I could imagine that as a set of Ruff rules:

> Using str.frobnicate prevents TurboJit on line 63

ekjhgkejhgk · 3mo ago · 7 in thread

Doesn't PyPy already have a jit compiler? Why aren't we using that?

olivia-banks · 3mo ago

As far as I know, PyPy doesn't support all CPython extensions, so pure Python code will probably (very likely) run fine but for other things most bets are off. I believe PyPy also only supports up to 3.11?

contravariant3mo ago

Why shouldn't the reference implementation get JIT? Just because some other implementations already have it is no reason not to. That'd be like skipping list comprehensions because they already exist in CPython.

3laspa3mo ago

Because the same people who made a big deal about supporting PyPy and PEP 399 when it was fashionable to do so are now told by their corporations that PyPy does not matter. CPython only moves with what is currently fashionable, employer mandated and profitable.

cpburns20093mo ago

PyPy is limited to maintenance mode due to a lack of funding/contributors. In the past, I think a few contributors or funding is what helped push "minor" PyPy versions. It's too bad PyPy couldn't take the federal funding the PSF threw away.

philipallstar3mo ago

> It's too bad PyPy couldn't take the federal funding the PSF threw away.

The PSF is primarily a political advocacy organisation, so it wouldn't make sense for them to use the money for Python.

JoshTriplett3mo ago

Because PyPy seems to be defunct. It hasn't updated for quite a while.

See https://github.com/numpy/numpy/issues/30416 for example. It's not being updated for compatibility with new versions of Python.

mkl3mo ago

PyPy's devs disagree: https://news.ycombinator.com/item?id=47293415

thunky3mo ago· 7 in thread

I always wanted this for Python but now that machines write code instead of humans I feel like languages like Python will not be needed as much anymore. They're made for humans, not machines. If a machine is going to do the dirty work I want it to produce something lean, fast, and strictly verified.

bigstrat20033mo ago

> now that machines write code instead of humans

That is not remotely the case for anyone who produces quality work.

thunky3mo ago

Look again.

If you care about quality you absolutely can guide a machine to produce that for you without writing a single line of code yourself.

And I expect the amount of guidance needed will continue to drop.

zahlman3mo ago

We got daguerreotypes, and then photographic film, and then digital cameras, along with image editing software, and now AI image generation systems; yet there are still people who go out and apply oil paints to a canvas with natural hair brushes. I'm not willing to lose that.

ddorian433mo ago

AI, write me that sqlalchemy clone in <lang>

JodieBenitez3mo ago

Pretty much my thoughts the other day... now that Codex does the writing, maybe I can finally switch to Go for the web backend stuff without being annoyed by some of its archaisms and gain significant execution performance, while still having a relatively easy to read language.

kccqzy3mo ago

You ask a machine to write your code and you still care about being easy to read?

In my experience the people who care the most about code readability tend to be the people most opinionated on having the right abstractions, which are historically not available in Go.

2 more replies

brianwawok3mo ago

I have shifted as much as I can from Python to Go when I don’t code. It’s just faster and the compiler catches more errors, win-win.

1 more reply

oystersareyum3mo ago· 6 in thread

> We don’t have proper free-threading support yet, but we’re aiming for that in 3.15/3.16. The JIT is now back on track.

I recently read an interview about implementing free-threading and getting modifications through the ecosystem to really enable it: https://alexalejandre.com/programming/interview-with-ngoldba...

The guy said he hopes the free-threaded build'll be the only one in "3.16 or 3.17", I wonder if that should apply to the JIT too or how the JIT and interpreter interact.

zarzavat3mo ago

I continue to believe that free-threading hurts performance more than it helps and Python should abandon it.

Having to have thread safe code all over the place just for the 1% of users who need to have multi-threading in Python and can't use subinterpreters for some reason is nuts.

cpgxiii3mo ago

> Having to have thread safe code all over the place just for the 1% of users who need to have multi-threading in Python and can't use subinterpreters for some reason is nuts.

Way more than 1% of the community, particularly of the community actively developing Python, wants free-threaded. The problem here is that the Python community consists of several different groups:

1. Basically pure Python code with no threading

2. Basically pure Python with appropriate thread safety

3. Basically pure Python code with already broken threaded code, just getting lucky for now

4. Mixed Python and C/C++/Rust code, with appropriate threading behavior in the C or C++ components

5. Mixed Python and C or C++ code, with C and C++ components depending on GIL behavior

Group 1 gets a slightly reduced performance. Groups 2 and 4 get a major win with free-threaded Python, being able to use threading through their interfaces to C/C++/Rust components. Group 3 is already writing buggy code and will probably see worse consequences from their existing bugs. Group 5 will have to either avoid threading in their Python code or rewrite their C/C++ components.

Right now, a big portion of the Python language developer base consists of Groups 2 and 4. Group 5 is basically perceived as holding Python-the-language and Python-the-implementations back.

1 more reply

pansa23mo ago

Maybe they could have two versions of the interpreter, one that’s thread-safe and one that’s optimised for single-threading?

Microsoft used to do this for their C runtime library.

2 more replies

kzrdude3mo ago

I don't want to go too heavy on the negatives, but what's nuts is Python going for trust-the-programmer style multithreading. The risk is that extension modules could cause a lot of crashes.

1 more reply

zadikian3mo ago

Pure Python code always needed mutexes for thread safety with or without ol' GIL. I thought the difficulty with removing the GIL instead had to do with C extensions that rely on it.
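A minimal sketch of that point: a read-modify-write such as `value += 1` spans several bytecodes, so shared state needs a lock whether or not a GIL is present. The `Counter` class and `run` helper are illustrative names, not from any library.

```python
import threading

class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The lock, not the GIL, is what makes this update atomic.
        with self._lock:
            self.value += 1

def run(n_threads=4, n_increments=10_000):
    counter = Counter()

    def worker():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value

total = run()
print(total)
```

With the lock in place the total is always `n_threads * n_increments`; without it, lost updates are possible on any build.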

1 more reply

reinhash3mo ago

I also wonder how many people actually need free-threading. And I wonder how useful it will be, when you can already use the ABI to call multi-threaded code.

I think the GIL provides python with a great guarantee, I would probably prefer single-thread performance improvements over multithreading in python to be honest.

Anyway if I need performance, Python would probably not be my first choice

fluidcruft3mo ago· 5 in thread

(what are blueberry, ripley, jones and prometheus?)

mkl3mo ago

Yes, the graphs are incomprehensible because those are not defined in the article. They turn out to be different physical machines with different architectures: https://doesjitgobrrr.com/about

  blueberry (aarch64)
  Description: Raspberry Pi 5, 8GB RAM, 256GB SSD
  OS: Debian GNU/Linux 12 (bookworm)
  Owner: Savannah Ostrowski

  ripley (x86_64)
  Description: Intel i5-8400 @ 2.80GHz, 8GB RAM, 500GB SSD
  OS: Ubuntu 24.04
  Owner: Savannah Ostrowski

  jones (aarch64)
  Description: Apple M3 Pro, 18GB RAM, 512GB SSD
  OS: macOS
  Owner: Savannah Ostrowski

  prometheus (x86_64)
  Description: AMD Ryzen 5 3600X @ 3.80GHz, 16GB RAM
  OS: Windows 11 Pro
  Owner: Savannah Ostrowski

max-m3mo ago

The names of the benchmark runners. https://doesjitgobrrr.com/about

fluidcruft3mo ago

So the biggest gains so far, ~20%, are on Windows 11 Pro (x86_64)? Is that because Windows was a bad baseline (prometheus)? The x86_64 Linux machine (ripley) doesn't seem to have improved as dramatically, only ~5%. I'm just surprised the OS has that much of an effect that can be attributed to the JIT rather than to other OS issues.

1 more reply

nonameiguess3mo ago

The immediate question has been answered, but what about the names? The latter three are obvious references to the Alien universe, but what relationship does blueberry have to them?

luhn3mo ago

I assume Blueberry is a nod to the machine being a Raspberry Pi.

ghm21993mo ago· 3 in thread

Thanks for all the amazing work! I have a noob question. Wouldn't this get the funding back? Or would that not be a preferable way to continue (as opposed to just volunteer-driven)?

Like, it is a big deal to get a project to a state where volunteers are spun up and actively breaking down tasks and getting work done, no? It's a Python JIT, something I know next to nothing about (as do most application developers), which tells one how difficult this must have been.

pansa23mo ago

> Wouldn't this get the funding back?

The funding was Microsoft employing most of the team. They were laid off (or at least, moved onto different projects), apparently because they weren't working on AI.

kelvinjps3mo ago

With Python being the main language for AI, isn't it more important for it to be performant? I kinda don't get Microsoft's reasoning; maybe they're just tight on money.

1 more reply

Ralfp3mo ago

It looks like ARM picked up plenty of those folk and pays them to continue this work.

killingtime743mo ago· 2 in thread

Sorry but the graphs are completely unreadable. There are four code names for each of the lines. Which is jit and which is cpython?

mkl3mo ago

They are all JIT on different architectures, measured relative to CPython. https://doesjitgobrrr.com/about: blueberry is aarch64 Raspberry Pi, ripley is x86_64 Intel, jones is aarch64 M3 Pro, prometheus is x86_64 AMD.

killingtime743mo ago

Thanks

a3w3mo ago· 1 in thread

Over 100% speedup sounds like "the code compiled before you asked the compiler to start working".

`from future import time_travel`

quietbritishjim3mo ago

If the speed of a car increases by 100% does that mean that it arrives at its destination before it left? No, it just means it took 50% of the time it would have otherwise.

But I do agree that it would be a bit clearer to talk in terms of time taken rather than speedup % i.e. instead of "20% slowdown to over 100% speedup" it's clearer to say "takes between 50% and 125% of the original time". (Especially since people very often say things like "3 times faster", which technically means 4 times as fast, when they should say "3 times as fast"; "takes 1/3 of the time" is unambiguous.)
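The conversion described above can be written down as a tiny helper. `time_fraction` is a hypothetical name; it assumes "N% speedup" means the speed is multiplied by `1 + N/100`, with a negative N standing for a slowdown.

```python
def time_fraction(speed_change_percent):
    """Fraction of the original running time after a speed change.

    +100 means "100% speedup" (twice as fast); -20 means "20% slowdown".
    """
    new_speed = 1 + speed_change_percent / 100
    return 1 / new_speed

print(time_fraction(100))   # 100% speedup
print(time_fraction(-20))   # 20% slowdown
print(time_fraction(300))   # "3 times faster" read literally, i.e. 4 times as fast
```

Under that reading, a 100% speedup takes half the original time, a 20% slowdown takes 125% of it, and "3 times faster" takes a quarter.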

vanderZwan3mo ago

> However, I misunderstood and came up with an even more extreme version: instead of tracing versions of normal instructions, I had only one instruction responsible for tracing, and all instructions in the second table point to that. Yes I know this part is confusing, I’ll hopefully try to explain better one day. This turned out to be a really really good choice. I found that the initial dual table approach was so much slower due to a doubling of the size of the interpreter, causing huge compiled code bloat, and naturally a slowdown.

> By using only a single instruction and two tables, we only increase the interpreter by a size of 1 instruction, and also keep the base interpreter ultra fast. I affectionally call this mechanism dual dispatch.

I really do hope they'll write that better explanation one day because this sounds pretty intriguing all on its own.
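Until that explanation exists, here is a toy sketch of one possible reading of the quoted description: two dispatch tables, where every entry in the second table points at a single recording handler that logs the instruction and then forwards to the normal handler. This is an assumption-laden illustration in Python with made-up opcodes and names, not CPython's actual C implementation.

```python
def op_push(vm, opcode, arg):
    vm["stack"].append(arg)

def op_add(vm, opcode, arg):
    b = vm["stack"].pop()
    a = vm["stack"].pop()
    vm["stack"].append(a + b)

# First table: one handler per instruction (the ordinary interpreter).
NORMAL_TABLE = {"PUSH": op_push, "ADD": op_add}

def op_record(vm, opcode, arg):
    # The single tracing instruction: record, then run the normal handler.
    vm["trace"].append((opcode, arg))
    NORMAL_TABLE[opcode](vm, opcode, arg)

# Second table: every opcode points at the same recording handler.
TRACING_TABLE = {opcode: op_record for opcode in NORMAL_TABLE}

def run(program, tracing=False):
    vm = {"stack": [], "trace": []}
    table = TRACING_TABLE if tracing else NORMAL_TABLE
    for opcode, arg in program:
        table[opcode](vm, opcode, arg)
    return vm

program = [("PUSH", 2), ("PUSH", 3), ("ADD", None)]
print(run(program)["stack"])
print(run(program, tracing=True)["trace"])
```

In this toy, switching tracing on only swaps which table is consulted, so the normal handlers are not duplicated; that seems to match the "increase the interpreter by a size of 1 instruction" claim, though the real mechanism may differ in important ways.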

pjmlp3mo ago

Great to see this going. Python also deserves a JIT, and given that only a few bother with PyPy or GraalPy, shipping it in CPython is the only way to have less "rewrite into XYZ".

Kudos to those involved in making it happen.

2 more replies


mattclarkdotnet3mo ago· 38 in thread

mattclarkdotnet3mo ago

To clarify, it is nuts that in an object method, there is a performance enhancement through caching a member value.

  class SomeClass:
    def __init__(self):
      self.x = 0
    def SomeMethod(self):
      q = self.x
      # do stuff with q, because otherwise you're dereferencing self.x all the damn time
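For readers who want to try this, a self-contained sketch of the two styles follows. `Accumulator` and its methods are hypothetical names; the code makes no claim about how large the difference is on any given interpreter, and `timeit` would be the way to measure it.

```python
class Accumulator:
    def __init__(self):
        self.x = 3

    def total_attr(self, n):
        total = 0
        for _ in range(n):
            total += self.x   # attribute lookup on every iteration
        return total

    def total_local(self, n):
        x = self.x            # looked up once, then read as a local
        total = 0
        for _ in range(n):
            total += x
        return total

acc = Accumulator()
print(acc.total_attr(1000), acc.total_local(1000))
```

Both methods return the same value here, but only because `x` is a plain attribute that nothing reassigns during the loop; the interpreter cannot assume that in general.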

17186274403mo ago

1 more reply

dekelpilli3mo ago

Java also has a performance cost to accessing class fields, as exampled by this (now-replaced) code in the JDK itself - https://github.com/openjdk/jdk/blob/jdk8-b120/jdk/src/share/...

2 more replies

mathisfun1233mo ago

> it is nuts that in an object method, there is a performance enhancement through caching a member value

lots of hottakes that people have (like this one) are rooted in just a fundamental misunderstanding of the language and programming languages in general <shrugs>.

4 more replies

duskdozer3mo ago

You mean even if x is not a property?

stabbles3mo ago

Hendrikto3mo ago

Being a superset would mean all valid Python 3 is valid Python 4. A valuable property for sure, but not what OP suggested. In fact, it is the exact opposite.

bloppe3mo ago

But that's just not what python is for. Move your performance-critical logic into a native module.

mattclarkdotnet3mo ago

2 more replies

wiseowise3mo ago

I’ll be happy if overnight all Python code in the world can reap 10-100x performance benefits without changing much of a codebase; you can continue having a soup of multiple languages.

3 more replies

fermigier3mo ago

I have made some experiments with P2W, my experimental Python (subset) to WASM compiler. Initial figures are encouraging (5x speedup, on specific programs).

https://github.com/abilian/p2w

NB: some preliminary results:

  p2w is 4.03x SLOWER than gcc (geometric mean)

  p2w is 5.50x FASTER than cpython (geometric mean)

  p2w is 1.24x FASTER than pypy (geometric mean)

BerislavLopac3mo ago

> python code could be so much faster if it didn't have to assume everything could change at any time

Definitely, but then it wouldn't be Python. One of the core principles of Python's design is to be extremely dynamic, and that anything can change at any time.

There are many other, pretty good, strictly dynamically typed languages which work just as well if not better than Python, for many purposes.

oblio3mo ago

I feel that this excuse is being trotted out too much. Most engineers never get to choose the programming language used for 90% of their professional projects.

And when Python is a mainstream language on top of which large, globally known websites, AI tools, core system utilities, etc are built, we should give up the purity angle and be practical.

Even the new performance push in Python land is a reflection of this. A long time ago some optimizations were refused in order to not complicate the default Python implementation.

1 more reply

wolvesechoes3mo ago

> Python really needs to take the Typescript approach of "all valid Python4 is valid Python3"

It is called type hints, and is already there. TS typing doesn't bring any perf benefits over plain JS.

stabbles3mo ago

You really need dedicated types for `int64` and something like `final`. Consider:

    class Foo:
      __slots__ = ("a", "b")
      a: int
      b: float

there are multiple issues with Python that prevent optimizations:

* a user can define subtype `class my_int(int)`, so you cannot optimize the layout of `class Foo`

* the builtin `int` and `float` are big-int like numbers, so operations on them are branchy and allocating.

and the fact that Foo is mutable and that `id(foo.a)` has to produce something complicates things further.
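A short sketch of those obstacles, reusing `Foo` and `my_int` from the comment. The overridden `__add__` is an invented detail, added only to show that a subtype can change behavior.

```python
class Foo:
    __slots__ = ("a", "b")
    a: int
    b: float

class my_int(int):
    """A user-defined subtype; instances still pass isinstance(..., int)."""

    def __add__(self, other):
        return "surprise"

foo = Foo()
foo.a = my_int(1)   # accepted: annotations are not enforced at runtime
foo.b = 2 ** 100    # also accepted, and far too large for a 64-bit field

print(isinstance(foo.a, int), foo.a + 1, foo.b)
```

Since `foo.a` may hold any `int` subclass and `foo.b` may hold anything at all, a compiler cannot lay `Foo` out as two fixed-width machine fields based on the annotations alone.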

1 more reply

phkahler3mo ago

>> Sensible type-annotated python code could be so much faster if it didn't have to assume everything could change at any time.

Then it wouldn't be Python any more.

0xffff23mo ago

1 more reply

germandiago3mo ago

I share your view. Python's flexibility is central to Python.

Even type annotations, though useful, can get in the way for certain tasks. Betting on things like these to speed things up would be a mistake, since it would kind of force you to follow that style.

Anything that accelerates things should rely on run-time data, not on type annotations that won't change.

panzi3mo ago

Isn't rpython doing that, allowing changes on startup and then it's basically statically typed? Does it still exist? Was it ever production ready? I only once read a paper about it decades ago.

zahlman3mo ago

It exists in the sense that PyPy exists.

As far as I can tell, it only ever existed to make PyPy possible, and was only defined/specified in terms of PyPy's needs.

mattclarkdotnet3mo ago

RPython is great, but it changes semantics in all sorts of ways. No sets for example. WTF? The native Set type is one of the best features of Python. Tuples also get mangled in RPython.

rich_sasha3mo ago

I think sadly a lot of Python in the wild relies heavily, somewhere, on the crazy unoptimisable stuff. For example pytest monkey patches everything everywhere all the time.

You could make this clean break and call it Python 4 but frankly I fear it won't be Python anymore.

fyrn_3mo ago

1 more reply

germandiago3mo ago

If you do that you then have a less productive language for many use cases IMHO.

All the dynamism from Python should stay where it is.

Just JIT and remember a type maybe, but do not force a type from a type hint or such things.

As a minimum, I would say not relying on that is the correct thing. You could exploit it, but not force it to change the semantics.

1 more reply

mattclarkdotnet3mo ago

NetMageSCW3mo ago

Perl 6 showed what happens when you do something like that.

dobremeno3mo ago

SPy [1] is a new attempt at something like this.

TL;DR: SPy is a variant of Python specifically designed to be statically compilable while retaining a lot of the "useful" dynamic parts of Python.

The effort is led by Antonio Cuni, Principal Software Engineer at Anaconda. Still very early days but it seems promising to me.

[1] https://github.com/spylang/spy

mattclarkdotnet3mo ago

Thank you! Spy looks brilliant, especially the comptime-like freezing after import.

musicale3mo ago

> Python really needs to take the Typescript approach of "all valid Python4 is valid Python3"

Great idea, but I'm not convinced that they learned anything from the Python 2 to 3 transition, so I wouldn't hold my breath.

musicale3mo ago

is that you, python core dev team? ;-)

1 more reply

BiteCode_dev3mo ago

There will be no Python 4, and 3.X policy requires forward compat, so we are already there.

mattclarkdotnet3mo ago

Oh, and while we're at it, fix the "empty array is instantiated at parse time so all your functions with a default empty array argument share the same object" bullshit.

zahlman3mo ago

We don't call them "arrays".

It has nothing to do with whether the list is empty. It has nothing to do with lists at all. It's the behaviour of default arguments.

It happens at the time that the function object is created, which is during runtime.

The behaviour can be used intentionally. (I would argue that it's overused intentionally; people use it to "bind" loop variables to lambdas when they should be using `functools.partial`.)

If you're getting got by this, you're fundamentally expecting Python to work in a way that Pythonistas consider not to make sense.
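The lambda-binding point can be sketched as follows. `identity` and `make_callbacks` are hypothetical helpers for illustration.

```python
from functools import partial

def identity(value):
    return value

def make_callbacks():
    late, by_default, by_partial = [], [], []
    for i in range(3):
        late.append(lambda: i)                    # looks up 'i' when called
        by_default.append(lambda i=i: i)          # default evaluated now
        by_partial.append(partial(identity, i))   # value bound explicitly
    return late, by_default, by_partial

late, by_default, by_partial = make_callbacks()
print([f() for f in late])
print([f() for f in by_default])
print([f() for f in by_partial])
```

The plain lambdas all see the final value of `i`, while the default-argument trick and `functools.partial` each capture the value at the time the callable is created.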

1 more reply

Izkata3mo ago

Execution time, not parse time. It's a side effect of function declarations being statements that are executed, not the list/dict itself. It would happen with any object.
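A small sketch of the behavior under discussion, plus the conventional `None` sentinel workaround. The function names are made up for the example.

```python
def append_shared(item, bucket=[]):
    # The default list was created once, when 'def' executed.
    bucket.append(item)
    return bucket

def append_fresh(item, bucket=None):
    # Sentinel idiom: build a new list on each call that omits 'bucket'.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

first = append_shared(1)
second = append_shared(2)
print(first is second, second)
print(append_fresh(1), append_fresh(2))
```

The shared list is stored on the function object itself (in `append_shared.__defaults__`), which is why it would happen with any default object, not only lists.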

2 more replies

exyi3mo ago

If you change this you break a common optimization:

https://github.com/python/cpython/blob/3.14/Lib/json/encoder...

Default value is evaluated once, and accessing parameter is much cheaper than global
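As a rough illustration of that pattern (not a copy of the linked code), the sketch below binds `math.sqrt` and `sum` as default parameters so they are read as locals inside the function. The function names are hypothetical, and whether the difference is measurable depends on the interpreter version.

```python
import math

def norm_global(values):
    # 'math' is a global lookup and 'sqrt' an attribute lookup on each call.
    return math.sqrt(sum(v * v for v in values))

def norm_bound(values, _sqrt=math.sqrt, _sum=sum):
    # Defaults were evaluated once at definition time and act as locals here.
    return _sqrt(_sum(v * v for v in values))

print(norm_global([3, 4]), norm_bound([3, 4]))
```

Making defaults late-bound would force these expressions to be re-evaluated on every call, which is what would defeat the trick.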

zeratax3mo ago

there is PEP 671 for that, which introduces extra syntax for the behavior you want. people rely on the current behavior so you can't really change it

giancarlostoro3mo ago

Still churning on it, will probably publish it and do a proper blog post once I've built something interesting with the language itself.

coredog643mo ago

IronPython -> TitaniumPython?

ecshafer3mo ago· 16 in thread

0cf8612b2e1e3mo ago

echelon3mo ago

It's a shame that Python 2->3 transition was so painful, because Python could use a few more clean breaks with the past.

This would be a potential case for a new major version number.

1 more reply

patmorgan233mo ago

Ooo, this makes sense. It's as if Linux had "don't break userspace" AND a whole bunch of other purely internal APIs you also can't refactor.

hardwaregeek3mo ago

For what it’s worth Ruby’s JIT took several different implementations, definitely struggled with Rails compatibility and literally used some people’s PhD research. It wasn’t a trivial affair

stmw3mo ago

RussianCow3mo ago

That makes sense if you're comparing with Java or C#, but not Ruby, which is way more dynamic than Python.

To your point, though, the C API has made certain types of optimizations extremely difficult, as the PyPy team has figured out.

2 more replies

simonask3mo ago

The simplest JIT just generates the machine code instructions that the interpreter loop would execute anyway. It’s not an extremely difficult thing, but it also doesn’t give you much benefit.

1 more reply

kelvinjps3mo ago

fleetfox3mo ago

fridder3mo ago

For better or for worse, they have been very consistent throughout the years that they don't want to degrade existing performance. It is why the GIL existed for so long.

bawolff3mo ago

I thought php hasn't shipped jit yet (as in its behind a disabled by default config)

SahAssar3mo ago

PHP 8 shipped with JIT on by default unless I'm mistaken.

1 more reply

wat100003mo ago

PHP and JS had huge tech companies pouring resources into making them fast.

brokencode3mo ago

Are you forgetting about PyPy, which has existed for almost 2 decades at this point?

RussianCow3mo ago

1 more reply

g947o3mo ago

Money.

owaislone3mo ago· 13 in thread

scorpioxy3mo ago

Your last point about being able to change internals more freely is also great in theory but very difficult(if not impossible) to achieve in practice.

eru3mo ago

> I still maintain some 2.x python code-bases that will be very expensive to migrate and the customer is not willing to invest that money.

Slight tangent: if Claude can decimate IBM stock price by migrating off Cobol for cheap, surely we can do Python 2 to 3 now, too?

2 more replies

smcl3mo ago

Since the switch we have seen enormous companies being built from scratch. There is no reason for anyone to be complaining about it being too hard to upgrade in 2026

rtpg3mo ago

Living through it... Python 3 made a lot of changes for the better but 3.0 in particular included a bunch of unforced errors that made it too hard for people to upgrade in one go.

It wasn't until much later (I would say 3.4 or 3.5?) that we had good tooling to allow for migrating from Python 2 to Python 3 gradually, which is what most tools needed to do.

badsectoracula3mo ago

> Python is by most measures the most popular language and became so AFTER that switch

> There is no reason for anyone to be complaining about it being too hard to upgrade in 2026

2 more replies

20k3mo ago

It's widely regarded as a disaster for good reason, and that forced some corrections in Python to fix it. Just because it's fine now does not mean it was always fine.

1 more reply

bmitc3mo ago

Those are unrelated.

1 more reply

nurettin3mo ago

The biggest (and worst planned) change was module names. Your imports didn't work, forcing hacks like

    if sys.version_info.major == 2:
        import old
    else:
        import new

Or worse, people used try/except in their imports.
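For a concrete instance, the sketch below uses the `urlparse` module, which became `urllib.parse` in Python 3, as the renamed module. It shows both styles mentioned above; the choice of module is mine, not the commenter's.

```python
import sys

# Version check, in the style of the snippet above.
if sys.version_info.major == 2:
    from urlparse import urlparse       # Python 2 location
else:
    from urllib.parse import urlparse   # Python 3 location

# The try/except variant that many libraries used instead.
try:
    from urllib.parse import urljoin
except ImportError:
    from urlparse import urljoin

print(urlparse("https://example.com/a").netloc)
print(urljoin("https://example.com/a/", "b"))
```

Either form had to be repeated for every renamed module a codebase touched, which is the maintenance burden being described.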

jmspring3mo ago

still GIL

marcyb5st3mo ago

Opt-in starting from 3.15, or am I mistaken?

Anyway you can already try freethreaded builds that have the GIL disabled, but my experience is that most of your dependencies won't work.

gjvc3mo ago

yes. it was not a massive shift. it was barely worth the effort.

pansa23mo ago

3 more replies

gjvc3mo ago

this must be right, i'm getting downvoted

2 more replies

adrian173mo ago· 11 in thread

VagabundoP2mo ago

Occasionally Core.py will do some updates, higher level stuff:

https://open.spotify.com/show/1PGRfdrLEwgXjQbPBNk1pW

pablo and Łukasz

kenjin40963mo ago

> I've never seen where the high level discussions were happening

I wrote a somewhat high-level overview here in a previous blog post https://fidget-spinner.github.io/posts/faster-jit-plan.html#...

> does this mean each opcode is possibly split into two (or more?) stencils, with and without removed increfs/decrefs?

If you have any more questions, I'm happy to answer them either in public or email.

adrian173mo ago

I saw your documentation PR, thank you!

I also did some reading and experiments, so quickly talking about things I've found out re: refcount elimination:

_POP_TOP_INT_r32, _r21, _r10.

But when __neg__ returns a new instance of the int-like class, then it emits

_SPILL_OR_RELOAD_r31, _POP_TOP_r10, _SPILL_OR_RELOAD_r01, _POP_TOP_r10, etc.

kenjin40963mo ago

Update: I put up a PR to document the trace recording interpreter https://github.com/python/cpython/pull/146110

flakes3mo ago

You’ll probably want to look to the PEPs. Haven't dug into this topic myself, but this looks related: https://peps.python.org/pep-0744/

adrian173mo ago

I think CPython already had tier2 and some tracing infrastructure when the copy-and-patch JIT backend was added; it's the "JIT frontend" that's more obscure to me.

rtpg3mo ago

discussions might be happening on the Python forums, which are pretty active.

https://discuss.python.org/t/pep-744-jit-compilation/50756/8... here's one thing

I do think you can also just outright ask questions about it on the forums and you'll get some answers.

At the end of the day there's only so many people working on this though.

saikia813mo ago

have you read the dev mailing list? There the developers of python discuss lots.

pansa23mo ago

There isn’t a dev mailing list any more, is there? Do you mean the Discourse forum?

sheepscreek3mo ago

UPDATE: I misunderstood the question :-/ You can ignore this.

I love playing with compilers for fun, so maybe I can shed some light. I’ll explain it in a simplified way for everyone’s benefit (going to ignore the stack):

yuliyp3mo ago

what at all does this comment have to do with what it's replying to?

1 more reply

rslashuser3mo ago· 7 in thread

I'm curious if the JIT developers could mention any Python features that prevent promising JIT features. An earlier Ken Jin blog post [1] mentions how __del__ complicates reference counting optimization.

[1] https://fidget-spinner.github.io/posts/faster-jit-plan.html

rtpg3mo ago

cpgxiii3mo ago

> In practice CPython reliably calls it cuz it reference counts ... In a world where more people were using PyPy we could have pressure from that perspective to avoid leaning into it

1 more reply

nvme0n1p13mo ago

> code cannot have any visibility into deallocations

Doesn't FinalizationRegistry let you do exactly that?

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...

2 more replies

jonathanlydall3mo ago

> meaning that code cannot have any visibility into deallocations.

1 more reply

adgjlsfhk13mo ago

The biggest thing is BigInt by default. It makes every integer operation require an overflow check.

ridiculous_fish3mo ago

JS (when using ints, which v8 does) is the same in this respect.

kstrauser3mo ago

Huh, I could imagine that as a set of Ruff rules:

> Using str.frobnicate prevents TurboJit on line 63

ekjhgkejhgk3mo ago· 7 in thread

Doesn't PyPy already have a jit compiler? Why aren't we using that?

olivia-banks3mo ago

contravariant3mo ago

3laspa3mo ago

cpburns20093mo ago

philipallstar3mo ago

> It's too bad PyPy couldn't take the federal funding the PSF threw away.

The PSF is primarily a political advocacy organisation, so it wouldn't make sense for them to use the money for Python.

JoshTriplett3mo ago

Because PyPy seems to be defunct. It hasn't updated for quite a while.

See https://github.com/numpy/numpy/issues/30416 for example. It's not being updated for compatibility with new versions of Python.

mkl3mo ago

PyPy's devs disagree: https://news.ycombinator.com/item?id=47293415

thunky3mo ago· 7 in thread

bigstrat20033mo ago

> now that machines write code instead of humans

That is not remotely the case for anyone who produces quality work.

thunky3mo ago

Look again.

If you care about quality you absolutely can guide a machine to produce that for you without writing a single line of code yourself.

And I expect the amount of guidance needed will continue to drop.

zahlman3mo ago

ddorian433mo ago

AI, write me that sqlalchemy clone in <lang>

JodieBenitez3mo ago

kccqzy3mo ago

You ask a machine to write your code and you still care about being easy to read?

In my experience the people who care the most about code readability tend to be the people most opinionated on having the right abstractions, which are historically not available in Go.

2 more replies

brianwawok3mo ago

I have shifted as much as I can python to go when I don’t code. It’s just faster and the compiler catches more errors, win win,

1 more reply

oystersareyum3mo ago· 6 in thread

> We don’t have proper free-threading support yet, but we’re aiming for that in 3.15/3.16. The JIT is now back on track.

I recently read an interview about implementing free-threading and getting modifications through the ecosystem to really enable it: https://alexalejandre.com/programming/interview-with-ngoldba...

The guy said he hopes the free-threaded build'll be the only one in "3.16 or 3.17", I wonder if that should apply to the JIT too or how the JIT and interpreter interact.

zarzavat3mo ago

I continue to believe that free-threading hurts performance more than it helps and Python should abandon it.

Having to have thread safe code all over the place just for the 1% of users who need to have multi-threading in Python and can't use subinterpreters for some reason is nuts.

cpgxiii3mo ago

> Having to have thread safe code all over the place just for the 1% of users who need to have multi-threading in Python and can't use subinterpreters for some reason is nuts.

Way more than 1% of the community, particularly of the community actively developing Python, wants free-threaded. The problem here is that the Python community consists of several different groups:

1. Basically pure Python code with no threading

2. Basically pure Python with appropriate thread safety

3. Basically pure Python code with already broken threaded code, just getting lucky for now

4. Mixed Python and C/C++/Rust code, with appropriate threading behavior in the C or C++ components

5. Mixed Python and C or C++ code, with C and C++ components depending on GIL behavior

Right now, a big portion of the Python language developer base consists of Groups 2 and 4. Group 5 is basically perceived as holding Python-the-language and Python-the-implementations back.

1 more reply

pansa23mo ago

Maybe they could have two versions of the interpreter, one that’s thread-safe and one that’s optimised for single-threading?

Microsoft used to do this for their C runtime library.

2 more replies

kzrdude3mo ago

I don't want to go too heavy on the negatives, but what's nuts is Python going for trust-the-programmer style multithreading. The risk is that extension modules could cause a lot of crashes.

1 more reply

zadikian3mo ago

Pure Python code always needed mutexes for thread safety with or without ol' GIL. I thought the difficulty with removing the GIL instead had to do with C extensions that rely on it.

1 more reply

reinhash3mo ago

I also wonder how many people actually need free-threading. And I wonder how useful it will be, when you can already use the ABI to call multi-threaded code.

I think the GIL provides python with a great guarantee, I would probably prefer single-thread performance improvements over multithreading in python to be honest.

Anyway if I need performance, Python would probably not be my first choice

fluidcruft3mo ago· 5 in thread

(what are blueberry, ripley, jones and prometheus?)

mkl3mo ago

Yes, the graphs are incomprehensible because those are not defined in the article. They turn out to be different physical machines with different architectures: https://doesjitgobrrr.com/about

  blueberry (aarch64)
  Description: Raspberry Pi 5, 8GB RAM, 256GB SSD
  OS: Debian GNU/Linux 12 (bookworm)
  Owner: Savannah Ostrowski

  ripley (x86_64)
  Description: Intel i5-8400 @ 2.80GHz, 8GB RAM, 500GB SSD
  OS: Ubuntu 24.04
  Owner: Savannah Ostrowski

  jones (aarch64)
  Description: Apple M3 Pro, 18GB RAM, 512GB SSD
  OS: macOS
  Owner: Savannah Ostrowski

  prometheus (x86_64)
  Description: AMD Ryzen 5 3600X @ 3.80GHz, 16GB RAM
  OS: Windows 11 Pro
  Owner: Savannah Ostrowski

max-m3mo ago

The names of the benchmark runners. https://doesjitgobrrr.com/about

nonameiguess3mo ago

The immediate question has been answered, but what about the names? The latter three are obvious references to the Alien universe, but what relationship does blueberry have to them?

luhn3mo ago

I assume Blueberry is a nod to the machine being a Raspberry Pi.

ghm21993mo ago· 3 in thread

Thanks for all the amazing work! I have a noob question: wouldn't this get the funding back? Or would that not be the preferable way to continue (as opposed to staying purely volunteer-driven)?

pansa23mo ago

> Wouldn't this get the funding back?

The funding was Microsoft employing most of the team. They were laid off (or at least, moved onto different projects), apparently because they weren't working on AI.

kelvinjps3mo ago

With Python being the main language for AI, isn't it more important for it to be performant? I kinda don't get Microsoft's reasoning; maybe they're just tight on money.

Ralfp3mo ago

It looks like ARM picked up plenty of those folk and pays them to continue this work.

killingtime743mo ago· 2 in thread

Sorry but the graphs are completely unreadable. There are four code names for each of the lines. Which is jit and which is cpython?

mkl3mo ago

killingtime743mo ago

Thanks

a3w3mo ago· 1 in thread

Over 100% speedup sounds like "the code compiled before you asked the compiler to start working".

`from __future__ import time_travel`

quietbritishjim3mo ago

If the speed of a car increases by 100% does that mean that it arrives at its destination before it left? No, it just means it took 50% of the time it would have otherwise.
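
The arithmetic, as a quick sketch: this assumes "speedup" means an increase in speed (work per unit time), and `time_after_speedup` is just an illustrative helper name:

```python
def time_after_speedup(old_time, speedup_percent):
    """Return the new running time after a given percentage increase in speed."""
    # Speed is work / time, so multiplying speed by (1 + p/100)
    # divides the time by the same factor.
    return old_time / (1 + speedup_percent / 100)


print(time_after_speedup(10.0, 100))  # 5.0: a 100% speedup halves the time
```

Under that definition, a 200% speedup cuts the time to a third, and no finite speedup ever gets the time to zero, let alone below it.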

vanderZwan3mo ago

I really do hope they'll write that better explanation one day because this sounds pretty intriguing all on its own.

pjmlp3mo ago

Great to see this going. Python also deserves a JIT, and given that only a few bother with PyPy or GraalPy, shipping it in CPython is the only way to have less "rewrite into XYZ".

Kudos to those involved in making it happen.
