Also, as stated in the article, on Apple silicon Wasmi currently performs kinda poorly but this will be improved in the future. On AMD server chips Wasmi is the fastest Wasm interpreter.
https://wasmi-labs.github.io/blog/posts/wasmi-v0.32/benches/...
I think smart contract execution is a good application of WebAssembly. It seems promising!
> It is an excellent choice for plugin systems, cloud hosts and as smart contract execution engine.
Would also be nice for running sandboxed code more dynamically on mini home servers, as with something like https://sandstorm.io/
Only issue is, not all languages that compile to WASM are deterministic.
I assume that for very short-lived programs, a stack-based interpreter could be faster, and for long-lived programs, a JIT would be better.
This new IR seems to be targeting a sweet spot of short, but not super short lived programs. Also, it's a great alternative for environments where JIT is not possible.
I'm happy to see all these alternatives in the Wasm world! It's really cool. And thanks for sharing!
Even faster startup times can be achieved by so-called in-place interpreters that do not even translate the Wasm binary at all and instead directly execute it without adjustments. Obviously this is slower at execution compared to re-writing interpreters.
Examples of Wasm in-place interpreters are toywasm (https://github.com/yamt/toywasm) and WAMR's classic interpreter.
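To make the trade-off concrete, here is a minimal sketch of both styles for a tiny subset of Wasm (`i32.const`, `i32.add`, `i32.mul`, `end`). The function and type names are made up for illustration and are not taken from Wasmi, toywasm or WAMR. The rewriting variant here stays stack-based to keep things short, so it only shows the "decode once, then execute" idea, not any real engine's IR.

```rust
// Real Wasm opcodes for the toy subset handled here.
const I32_CONST: u8 = 0x41;
const I32_ADD: u8 = 0x6A;
const I32_MUL: u8 = 0x6C;
const END: u8 = 0x0B;

/// Decodes a signed LEB128 i32 at `*pc` and advances `*pc` past it.
fn read_sleb_i32(code: &[u8], pc: &mut usize) -> Option<i32> {
    let mut result: i64 = 0;
    let mut shift = 0u32;
    loop {
        let byte = *code.get(*pc)?;
        *pc += 1;
        result |= i64::from(byte & 0x7F) << shift;
        shift += 7;
        if byte & 0x80 == 0 {
            // Sign-extend when the sign bit of the final group is set.
            if byte & 0x40 != 0 {
                result |= -1i64 << shift;
            }
            if result < i64::from(i32::MIN) || result > i64::from(i32::MAX) {
                return None;
            }
            return Some(result as i32);
        }
        if shift >= 35 {
            return None; // more than 5 bytes is not a valid i32 immediate
        }
    }
}

/// In-place style: executes the raw bytes directly, so there is no
/// translation step, but every immediate is re-decoded on every execution.
fn run_in_place(code: &[u8]) -> Option<i32> {
    let mut stack: Vec<i32> = Vec::new();
    let mut pc = 0;
    loop {
        let op = *code.get(pc)?;
        pc += 1;
        match op {
            I32_CONST => {
                let value = read_sleb_i32(code, &mut pc)?;
                stack.push(value);
            }
            I32_ADD => {
                let (b, a) = (stack.pop()?, stack.pop()?);
                stack.push(a.wrapping_add(b));
            }
            I32_MUL => {
                let (b, a) = (stack.pop()?, stack.pop()?);
                stack.push(a.wrapping_mul(b));
            }
            END => return stack.pop(),
            _ => return None, // opcode outside the toy subset
        }
    }
}

/// Pre-decoded instruction used by the rewriting style.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Instr {
    Const(i32),
    Add,
    Mul,
    End,
}

/// Rewriting style, step 1: pay the decoding cost once, up front.
fn translate(code: &[u8]) -> Option<Vec<Instr>> {
    let mut out = Vec::new();
    let mut pc = 0;
    while pc < code.len() {
        let op = code[pc];
        pc += 1;
        out.push(match op {
            I32_CONST => Instr::Const(read_sleb_i32(code, &mut pc)?),
            I32_ADD => Instr::Add,
            I32_MUL => Instr::Mul,
            END => Instr::End,
            _ => return None,
        });
    }
    Some(out)
}

/// Rewriting style, step 2: execute the pre-decoded instructions.
fn run_translated(instrs: &[Instr]) -> Option<i32> {
    let mut stack: Vec<i32> = Vec::new();
    for instr in instrs {
        match *instr {
            Instr::Const(value) => stack.push(value),
            Instr::Add => {
                let (b, a) = (stack.pop()?, stack.pop()?);
                stack.push(a.wrapping_add(b));
            }
            Instr::Mul => {
                let (b, a) = (stack.pop()?, stack.pop()?);
                stack.push(a.wrapping_mul(b));
            }
            Instr::End => return stack.pop(),
        }
    }
    None
}
```

For the body `[0x41, 0x02, 0x41, 0x03, 0x6A, 0x41, 0x04, 0x6C, 0x0B]`, i.e. `(2 + 3) * 4`, both `run_in_place(&body)` and `translate(&body).and_then(|t| run_translated(&t))` evaluate to `Some(20)`. The first starts executing immediately, while the second does its decoding work before the first instruction runs.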
I wrote a paper about Wizard's in-place interpreter and benchmarked lots of runtimes in it (https://dl.acm.org/doi/abs/10.1145/3563311).
As there seem to be even more runtimes popping up (great job with wasmi, btw), it seems like a fun, maybe even full-time, job to keep up with them all.
The abundance of Wasm runtimes is a testament to how great the WebAssembly standard really is!
The non-WASI test cases are only for testing translation performance, so their imports do not need to be satisfied. That would only be necessary if the benchmarks tested instantiation performance instead. For most Wasm runtimes, though, instantiation is usually pretty fast compared to translation.
https://github.com/composablesys/wish-you-were-fast/tree/mas...
They run on nearly all engines.
I love all the improvements that Wasmi has been doing lately. Being so close to the super-optimal interpreter Stitch (a new interpreter similar to Wasm3, but made in Rust) is quite impressive.
As a side note, I wish most of the runtimes in Rust stopped adopting the "linker" paradigm for imports, as it is a completely unnecessary abstraction when setting up imports is close to zero cost.
When using lazy-unchecked translation with relatively small programs, setting up the Linker can sometimes take up the majority of the overall execution time with ~50 host functions (which is a common number). We are talking about microseconds, but microseconds matter at these scales. This is why we implemented the LinkerBuilder for Wasmi, for a 120x speed-up. :)
[1] https://github.com/wasmi-labs/wasmi/blob/master/crates/wasmi...
How does that correspond to implementing the wasi api? I think wasmtime is a javascript project for wasm on the web, is that a distinct thing to wasi?
(this file existing suggests wasmi contains an implementation of wasi https://github.com/wasmi-labs/wasmi/blob/master/crates/wasi/...)
But yes, Wasmi also supports WASI preview1 and can execute Wasm applications that have been compiled in compliance with WASI preview1.
Though I have to say that the "list of addresses" approach is not optimal in Rust today, since Rust is missing explicit tail calls. Stitch applies some tricks to achieve tail calls in Rust, but this has some drawbacks that are discussed in detail in Stitch's README.
Furthermore, the "list of addresses" approach (also known as threaded code dispatch) comes in several variants. From what I know, both Wasm3 and Stitch use direct threaded code, which stores a list of function pointers to instruction handlers and uses tail calls or computed-goto to fetch the next instruction. The downside compared to bytecode is that direct threaded code uses more memory, and it is only faster when coupled with computed-goto or tail calls. Otherwise, compilers nowadays are pretty solid at optimizing loop-switch constructs and could technically even generate computed-goto-like code.
Thus, due to the lower memory usage, the downsides of using tail calls in Rust, and the potential of compiler optimizations with loop-switch constructs, we went for the bytecode approach in Wasmi.
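For readers unfamiliar with the two dispatch styles, here is a rough sketch of both for a toy stack machine. All names (`Instr`, `Cell`, `op_push`, and so on) are hypothetical and this is not how Wasmi, Wasm3 or Stitch are actually implemented. The threaded variant ends every handler with a call to the next handler in tail position. Rust does not guarantee that such calls are turned into jumps, so in an unoptimized build each instruction may add a native stack frame, which is exactly the drawback mentioned above.

```rust
/// Bytecode style: a compact instruction enum driven by a loop + match.
#[derive(Clone, Copy)]
enum Instr {
    Push(i64),
    Add,
    Mul,
    Ret,
}

fn run_loop_match(code: &[Instr]) -> Option<i64> {
    let mut stack: Vec<i64> = Vec::new();
    let mut pc = 0;
    loop {
        match *code.get(pc)? {
            Instr::Push(value) => stack.push(value),
            Instr::Add => {
                let (b, a) = (stack.pop()?, stack.pop()?);
                stack.push(a.wrapping_add(b));
            }
            Instr::Mul => {
                let (b, a) = (stack.pop()?, stack.pop()?);
                stack.push(a.wrapping_mul(b));
            }
            Instr::Ret => return stack.pop(),
        }
        pc += 1;
    }
}

/// Threaded style: the program is a list of handler addresses with
/// immediates stored inline between them.
type Handler = fn(&[Cell], usize, &mut Vec<i64>) -> Option<i64>;

enum Cell {
    Op(Handler),
    Imm(i64),
}

/// Fetches the handler at `pc` and calls it in tail position.
/// `pc + 1` is handed over so the handler sees the cell after itself.
fn next(code: &[Cell], pc: usize, stack: &mut Vec<i64>) -> Option<i64> {
    match code.get(pc)? {
        Cell::Op(handler) => (*handler)(code, pc + 1, stack),
        Cell::Imm(_) => None, // malformed program: expected a handler
    }
}

fn op_push(code: &[Cell], pc: usize, stack: &mut Vec<i64>) -> Option<i64> {
    match code.get(pc)? {
        Cell::Imm(value) => stack.push(*value),
        Cell::Op(_) => return None, // malformed program: expected an immediate
    }
    next(code, pc + 1, stack)
}

fn op_add(code: &[Cell], pc: usize, stack: &mut Vec<i64>) -> Option<i64> {
    let (b, a) = (stack.pop()?, stack.pop()?);
    stack.push(a.wrapping_add(b));
    next(code, pc, stack)
}

fn op_mul(code: &[Cell], pc: usize, stack: &mut Vec<i64>) -> Option<i64> {
    let (b, a) = (stack.pop()?, stack.pop()?);
    stack.push(a.wrapping_mul(b));
    next(code, pc, stack)
}

fn op_ret(_code: &[Cell], _pc: usize, stack: &mut Vec<i64>) -> Option<i64> {
    stack.pop()
}

fn run_threaded(code: &[Cell]) -> Option<i64> {
    let mut stack: Vec<i64> = Vec::new();
    next(code, 0, &mut stack)
}
```

Running `(2 + 3) * 4` as `[Push(2), Push(3), Add, Push(4), Mul, Ret]` through `run_loop_match`, or as the equivalent handler list through `run_threaded`, gives `Some(20)` in both cases. The difference is purely in how the next instruction is reached: one central `match` versus an indirect call at the end of every handler.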