Just out of curiosity, I'd love to hear from others who've used PyPy for their web apps. Are there any issues to look out for? I remember that a few years ago, packages like psycopg2 were not compatible, which made the migration somewhat difficult. Would love to hear real-world experiences here.
psycopg2cffi fixed the psycopg2 issue some time ago.
I've run into compatibility issues with packages that reference math-centric libs (matplotlib?) but aside from that I've been quite happy with it.
Coincidentally I tried PyPy today for my shell, which is around 14K lines of completely unoptimized Python code [1]. I have never used PyPy before, despite being a long-time Python programmer.
I didn't expect PyPy to speed it up, just based on my impressions of the kind of workloads PyPy excels at.
In my first test case (parsing a 976-line shell script), PyPy took 2.0 seconds and CPython took 1.0. And that 2x slower number held up for a couple of other tests.
I will probably try running the benchmark in a loop to see if PyPy's JIT warms up (does it do that?). But I wasn't really expecting to use PyPy -- I just wanted to see how it does, because there aren't many ways to speed up my code without rewriting a good portion of it.
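A minimal sketch of that "run it in a loop" experiment: time the same workload several times in one process and compare early runs with later ones. `time_runs` and `sample_workload` are hypothetical names, and the workload is only a stand-in for parsing a shell script, not the actual shell code.

```python
import time


def time_runs(func, runs=10):
    """Call func() `runs` times; return each call's wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations


def sample_workload():
    # Hypothetical stand-in for string-heavy parsing: split text and count tokens.
    text = "echo hello | grep h && ls -l /tmp\n" * 2000
    counts = {}
    for token in text.split():
        counts[token] = counts.get(token, 0) + 1
    return counts


if __name__ == "__main__":
    for i, seconds in enumerate(time_runs(sample_workload)):
        print("run %d: %.4f s" % (i, seconds))
```

If later runs under PyPy get noticeably faster than the first ones, the JIT is warming up. If they stay flat, a single-shot benchmark was already representative.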
My impression is that JITs in general, not just PyPy, don't work well for certain workloads. IIRC LuaJIT is actually slower than Lua for string-processing workloads. A JIT makes a lot of sense in, say, machine learning applications, which are all floating point calculations. But string processing is probably dominated by allocations and locality, and the JIT doesn't do very much there, whether it's LuaJIT or PyPy.
EDIT:
* Failed building psycopg2. Apparently I need to use psycopg2cffi instead. Retrying...
* manage.py now can't import Django. Hmm.
* Yeah, I have no idea why it can't import Django. I took most of the settings out of settings.py, thinking it somehow caused an ImportError, but it still fails. I think I have to give up at this point, as I have no more guesses.
* Turns out manage.py was trying to launch `/usr/bin/env python` and it needed `/usr/bin/env pypy3`. Seems odd to include a `python` binary on the image and not just alias it to `pypy3`, but it is what it is. Continuing...
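For the interpreter mix-up in the last bullet, a quick check like the following (a sketch, with `describe_interpreter` as a hypothetical helper name) shows which implementation and binary actually ran a script such as manage.py:

```python
import platform
import sys


def describe_interpreter():
    """Return (implementation name, executable path) for the running interpreter."""
    return platform.python_implementation(), sys.executable


if __name__ == "__main__":
    name, path = describe_interpreter()
    # Prints e.g. "PyPy" or "CPython" followed by the binary the shebang resolved to.
    print("%s at %s" % (name, path))
```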
Welp, everything seems to be running just fine. Here's the diff of all the changes I had to make:
https://www.pastery.net/rrjdfj/
Basically, install psycopg2cffi instead of psycopg2, use pypy3 for the interpreter instead of python, and add two lines to settings.py. All in all, pretty damn short!
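The linked diff isn't reproduced here, so this is an assumption about what the two settings.py lines might look like: psycopg2cffi ships a `compat` module whose `register()` makes `import psycopg2` resolve to psycopg2cffi. The wrapper function and its name are hypothetical, added only so the sketch doesn't crash when the package isn't installed.

```python
def register_psycopg2_shim():
    """Make `import psycopg2` resolve to psycopg2cffi, if it is installed.

    Returns True when the shim was registered, False when psycopg2cffi
    could not be imported (e.g. on a plain CPython setup using psycopg2).
    """
    try:
        from psycopg2cffi import compat
    except ImportError:
        return False
    compat.register()
    return True


# Assumed placement: near the top of settings.py, before Django loads the
# PostgreSQL database backend.
registered = register_psycopg2_shim()
```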
Again, I haven't followed recent developments, so this may be less of an issue now.
Edit: I should add - we loved it! PyPy was like a magic bullet that solved our performance problems.
(feedback on this would be great, since someone else might have tried it recently)
We should indeed compare PyPy3.5 vs CPython 3.5.3, but having a benchmark suite that works on both continues to be a problem.
Regarding 2.7.13 - you might find it surprising, but it's actually SLOWER than 2.7.2. There have been no speed improvements and quite a few speed decreases, so we decided to keep the faster one.
EDIT: part of the problem is that comparing PyPy 2 vs CPython 3 is apples vs oranges, but PyPy3 is not ready yet (unicode improvements I'm working on right now are missing)
I don't find it that surprising, but do find it disappointing that you would run benchmarks against the current version, but not post them online for perusal, nor provide any sort of explanation for the use of the older version in head-to-head comparisons. For me, at least, it produces the impression that PyPy has something to hide, and I doubt I'm the only one.
This is the usual case of budget - if I had budget to have anyone improve the website, improve the buildbot, improve the benchmark comparison, trust me I would do it. Right now there are no volunteers and the benchmark side is sort of lingering on.
Can you or I just run the benchmarks against 2.7.13 to see how it goes?
There are two things you can do with this. The first is you can write your performance-sensitive code directly in Cython, in which case, yes, that's a direct competitor to PyPy. (So is writing your performance-sensitive code directly in C and using Python.h to expose it to Python as a native-code module.)
The second is that you can write bindings to existing C (or C-ABI-compatible, really) code in Cython, instead of using C and Python.h to write those bindings. In that case, it's not quite that you care about the performance of your C code, but that it already exists, and you just need to call into it somehow. Having PyPy be able to use these existing codebases is valuable.
They are not so much alternatives as complementary things. I guess this release allows Cython-compiled modules to work directly with PyPy, which is nice.
The PyPy JIT cannot look inside C code, and crossing the Python-C interface is slow, but give it a chance and you may be pleasantly surprised how fast your pure Python code can run.
[0] http://pypy.readthedocs.io/en/latest/cpython_differences.htm...
EDIT: apparently it does work [1] though not officially supported until/unless it can be tested on Travis CI.
[1] https://github.com/encode/apistar/issues/130#issuecomment-29...
Tests failing:
* https://github.com/explosion/spaCy
Confirmed working:
* https://github.com/explosion/thinc
* https://github.com/explosion/preshed
* https://github.com/explosion/cymem
* https://github.com/explosion/murmurhash
I doubt spaCy will ever be faster on PyPy (the neural network library Thinc is currently 50% slower). It'd still be really great to get it running, so people who benefit from PyPy for other parts of their stack don't have to manage two Python environments.
The last update on the PyPy+Pandas wiki[0] is from this August, and it mentions that there are still 15 outstanding failing tests. Does this release mean that 5.9 is now at 100% parity? What does the same metric look like for PyPy+NumPy, and where can that one be tracked if not 100% yet?
I am looking forward to migrating some pipelines over to 5.9 soon.
[0] https://bitbucket.org/pypy/pypy/wiki/cpyext_2_-_cython_and_p...
Of course if your workflow depends on those features, they are critical. We are working on full compatibility and also on increasing speed.
Update: found it at https://bitbucket.org/pypy/compatibility/wiki/Home