> No, creating a binary blob is not a realistic option, and it's still not a Python/Java/C++ whatever solution.
Once again, "realistic" is subjective, and I would say no "realistic" user will try to multiply arbitrarily sized matrices in pure Python. (I can see that small enough matrices, like 3x3 or 4x4, might be different.) And...
> So, it's not fair game in any honest benchmark. And even if using a C library is idiomatic Python, it still has no place in a language benchmark. It's a C library, not a Python implementation.
...you have correctly figured out that it's unfair for anyone using your (vague) definition of "realistic". However, your proposal is also unfair to anyone using my definition of "idiomatic". I could try to defend my definition with quantitative arguments, but I don't feel like doing so. In fact, I tend to ignore most "language benchmarks" because it is virtually impossible to make them reasonably fair. This one is no exception.
The best "language benchmarks" tend to be more like language showcases with useful commentary: there is no single winner, but you get a good sense of the pros and cons of each language-implementation-strategy combination. They are generally not advertised as "benchmarks", of course.