Unlike the Zef article, which describes implementation techniques, the Wren page also shows ways in which language design can contribute to performance.
In particular, Wren gives up dynamic object shapes, which enables copy-down inheritance and substantially simplifies (and hence accelerates) method lookup. Personally I think that’s a good trade-off - how often have you really needed to add a method to a class after construction?
See the experience with Smalltalk and Self, where everything is dynamic dispatch, everything is an object, and the whole system is a live image that can be monkey-patched at any given second.
PyPy and GraalPy, and the oldie IronPython, are much better experiences than where CPython currently stands.
A JIT would help most people more than removing the GIL; I wish PyPy had become the reference implementation back in the 2.7 days.
On the other hand, having a type hold a closed set of applicable functions is somewhat questionable.
There are languages out there that allow you to define arbitrary functions and then use them as methods with dot notation on any variable matching the type of the first argument, including Nim (with its uniform function call syntax), Scala (with implicit classes and type classes), Kotlin (with extension functions) and Rust (with traits).
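For instance, here is a minimal sketch of the Rust flavor of this: a trait lets you hang a new method on a type you don't own, giving the same dot-notation feel as Kotlin extension functions or Nim's call syntax. `Doubled` is a made-up trait purely for illustration.

```rust
trait Doubled {
    fn doubled(&self) -> Self;
}

// Attach the method to the existing i32 type.
impl Doubled for i32 {
    fn doubled(&self) -> i32 {
        self * 2
    }
}

fn main() {
    // Dot notation on a plain i32, as if the type always had the method.
    assert_eq!(21i32.doubled(), 42);
    println!("{}", 21i32.doubled()); // prints 42
}
```

Note the set is still statically closed per scope: the method is only callable where the trait is in view, which is quite different from patching a live class.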
"Efficient Implementation of the Smalltalk-80 System"
https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
Or its maintainability, and this is one of the big reasons why. Methods and variables are dynamically generated at runtime, which makes it impossible to even grep for them. If you have a large Ruby codebase (say GitLab or Asciidoctor), it can be almost impossible to trace through code unless you are familiar with the entire codebase.
Their "answer" is that you run the code and use the debugger, but that's clearly ridiculous.
So I would say dynamically defined classes are not only bad for performance; they're just bad in general.
A general rule of thumb is that if you can assign an expression a static type, then you can compile it fairly efficiently. Complex dynamic languages obviously actively fight this in numerous ways, and so end up being difficult to optimize. Seems obvious in retrospect.
The tradeoff is that this requires mutable AST nodes, which conflicts with the immutable-AST assumption most compilers rely on (e.g., for sharing subtrees or parallelizing compilation). For a single-threaded interpreter it works cleanly, but it'd be a problem if you wanted to JIT-compile from the same AST on a background thread while the interpreter is mutating nodes.
I’m basing that on the 1.6% improvement they got from speeding up sqrt. That surprised me, because to get such an improvement, the benchmark must spend over 1.6% of its time there to start with.
Looking in the git repo, it seems that did happen in the nbody simulation (https://github.com/pizlonator/zef/blob/master/ScriptBench/nb...).
Basically the flow was:
- check if we’re calling a method of an object
- nope, ok, so cascade through 10+ symbol comparisons
- sqrt was towards the bottom of the cascade
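A toy version of that dispatch cascade (the names are made up, not the actual Zef builtin list): every non-method call walks a chain of string comparisons, so a hot builtin near the bottom, like sqrt in the nbody benchmark, pays for all the misses above it on every single call.

```rust
fn call_builtin(name: &str, arg: f64) -> Option<f64> {
    // Each line is a string comparison; position in the chain = cost.
    if name == "print" { return Some(arg); }
    if name == "abs" { return Some(arg.abs()); }
    if name == "floor" { return Some(arg.floor()); }
    // ...imagine ten-plus more of these...
    if name == "sqrt" { return Some(arg.sqrt()); } // near the bottom
    None
}

fn main() {
    // In a hot loop, every sqrt call repeats the whole cascade first.
    assert_eq!(call_builtin("sqrt", 9.0), Some(3.0));
    assert_eq!(call_builtin("nope", 1.0), None);
    println!("ok");
}
```

Interning the names into symbol ids (or dedicated opcodes for hot builtins) turns the cascade into a constant-time jump, which is presumably where the 1.6% came from.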
I also like how, according to GitHub, the repo is 99.7% HTML and 0.3% C++. A testament to the interpreter's size, I guess?
But yeah the interpreter is very small
I didn't want any optimisation complexities and just focused on being able to understand my own Rust code. I was surprised by the performance I got simply by using my favourite language, and as a bonus, since Rust takes care of all the ownership and lifetimes, I don't need a garbage collector. For sure, right now I'm being super conservative and rely on cloning to avoid lifetime hell in things like closures, but the speed and memory profile is still very decent.
For anyone interested in a simple-to-understand tree-walking interpreter in Rust, heavily based on expressive enums where code is data, here's my interpreter:
> as a bonus, since Rust takes care of all the ownership and lifetimes, I don't need a garbage collector.
I can imagine GluonScript's memory handling comes at a cost, even if the tradeoff of using a borrow checker is well worth it. Was that your experience?
Relatedly, since you commented there has been a submission about garbage collectors in Rust ("Garbage Collection Without Unsafe Code"):
As far as I could tell from my research, only closures could generate dangling references and therefore need memory cleanup, and only if I allowed closures to access their environment (variables and functions) by reference / mutable reference.
To avoid this and simplify both my code and the mental model for the users of GluonScript, as of now, closures capture their environment by cloning it immutably. There's increased memory usage from all the copying of the environment, but there are never references to something that is no longer in use, and therefore no need for a GC. At the end of the day, all values captured by closures are owned Rust values that Rust drops when they go out of scope.
So this can lead to high memory usage in hot loops but it can't lead to memory leaks.
I've gone through something similar, but for a more functional language (a Scheme). It's interesting how here the biggest wins are from optimizing the objects, while the biggest wins in my case were optimizing closures. The optimizations were very similar.
"Three Implementation Models for Scheme" gives all the answers needed to make a fast-enough Scheme, though it has something of a compilation step, so it's not interpreting the original AST.
And the fact that having out-of-line calls to methods of value objects is so expensive
Is this tied to unions? Or otherwise, when does this happen? I don't see the connection w/ invisicaps or &c
It was materially useful in this project.
- Caught multiple memory safety issues in a nice deterministic way, so designing the object model was easier than it would have been otherwise.
- C++ with accurate GC is a really great programming model. I feel like it speeds me up by 1.5x relative to normal C++, and maybe 1.2x relative to other GC’d languages (because C++’s APIs are so rich and the lambdas, templates, and class system are so mature).
But I’m biased in multiple ways
- I made Fil-C++
- I’ve been programming in C++ for like 35ish years now
> happen to know C++ really well
That’s my bias, yeah. But C++ is good for more than just perf. If you need access to low-level APIs, or libraries that happen to be exposed as a C/C++ API, or you need good support for dynamic linking and separate compilation - then C++ (or C) is a great choice
There are many runtimes that I could have included but didn’t.
Also, it’s quite impressive how much faster PUC Lua is than QuickJS and Python
(I suppose the "quick" in QuickJS means "quick for a pure interpreter without JIT compilation" or something...)
So like that’s wild
Python's execution time is mostly spent looking up stuff. I don't think Lua is quite as dynamic.
That’s where, for example, getter inference happens.