What isn't mentioned is why SQLite is using a virtual machine in the first place. The reason is that SQLite only calculates the next row of results when you ask - it does not calculate all result rows at once in advance. Each time you ask for the next row of results it has to resume from where it last left off and calculate that next row. The virtual machine is a way of saving state between those calls for the next row (amongst other things).
This is also why there isn't a method in SQLite DB adapters to get the number of result rows. SQLite has no way of knowing the count other than actually calculating every row, which is the same amount of effort as fetching all of them.
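To make the row-at-a-time behaviour concrete, here is a rough sketch using Python's standard sqlite3 module. The table `t` and the counting function `traced` are made up for illustration; the counter just records how many rows SQLite has actually evaluated so far.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])

calls = {"n": 0}

def traced(x):
    # Hypothetical helper: called once for every result row SQLite computes.
    calls["n"] += 1
    return x

conn.create_function("traced", 1, traced)

cur = conn.execute("SELECT traced(x) FROM t")

first = cur.fetchone()
# Only a couple of rows have been computed at this point (the adapter
# may step one row ahead), nowhere near all 1000.
calls_after_first = calls["n"]

rest = cur.fetchall()
# Draining the cursor is what forces the remaining rows to be computed.
calls_after_all = calls["n"]

# The adapter cannot report a result count for a SELECT; rowcount is -1.
rowcount_for_select = cur.rowcount

print(calls_after_first, calls_after_all, rowcount_for_select)
```

The exact count after the first fetch depends on how the adapter drives the statement, so treat it as "small" rather than a fixed number.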
They could have achieved the same effect by implementing it in a language with continuations or coroutines, or maybe just iterator functions. That might be viable today if the language is fast enough and callable from C.
C# and LuaJIT spring to mind.
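For anyone unsure what "iterator functions" would buy you here, a minimal sketch in Python: a generator keeps its local state between calls, which is the same job the VM does between requests for the next row. `scan_rows`, `table` and `examined` are invented names, and this is not how SQLite does it.

```python
examined = []  # records which rows the scan has looked at so far

def scan_rows(table, predicate):
    """Hypothetical row source: yields one matching row per request."""
    for row in table:
        examined.append(row[0])
        if predicate(row):
            yield row  # execution pauses here until the caller asks again

table = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
rows = scan_rows(table, lambda r: r[0] % 2 == 0)

first = next(rows)                   # scans only as far as the first match
examined_after_first = list(examined)

second = next(rows)                  # resumes exactly where it left off
remaining = list(rows)               # nothing left after the last match

print(first, examined_after_first, second, remaining)
```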
Also, one of the goals of SQLite is to be available everywhere. It's available as super-standard-super-clean C code in one .c file that works on just about every architecture. Had they used C# or LuaJIT, that could not have been achieved.
I'm not sure if you are aware, but SQLite is what powers almost every app on Android or iPhone that needs tabular storage, including the built-in ones like Contacts.
They did. They just implemented their own minilanguage instead, that (AFAIK) simply has no conventional syntax-based serialization. It's still a language, though. Given their tight focus, it is completely plausible that they can create a little VM that will run far, far better than any general-purpose language VM could.
If performance and portability are your goals, C is going to be the logical choice.
C# and LuaJIT are not options. I don't really understand what your point is, because they implemented their own VM instead.
The VM used by SQLite is different from the VM used by things like JavaScript or Python in that SQL is not a general-purpose programming language. Most of the opcodes in the SQLite VM are heavy-weight concepts such as "insert a key/value pair into a b-tree" or "decode the N-th column of a table row". These opcodes are implemented with hundreds or thousands of lines of C code. The overhead of instruction dispatch is insignificant in comparison. And so there isn't much of a performance advantage one way or the other between a stack machine and a register machine in SQLite. The choice of a register machine is for maintainability, reliability, and testability.
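You can see these opcodes for yourself by putting EXPLAIN in front of a statement. A small sketch from Python's sqlite3 module follows; the table `t` and the query are made up, and the exact listing varies between SQLite versions, so take the output as illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b TEXT)")

# EXPLAIN returns the VM program for the statement instead of running it.
# Each row is (addr, opcode, p1, p2, p3, p4, p5, comment).
program = conn.execute("EXPLAIN SELECT b FROM t WHERE a > 5").fetchall()
opcodes = [row[1] for row in program]

for addr, opcode, p1, p2, p3, *_ in program:
    print(addr, opcode, p1, p2, p3)
```

Even for a trivial query, the listing is made up of coarse steps like opening a cursor on the table, decoding a column and emitting a result row, rather than low-level arithmetic.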
I am no expert in this field, but AFAIK http://static.usenix.org/events/vee05/full_papers/p153-yunhe... (discussed on http://news.ycombinator.com/item?id=704576) has not really been refuted.
I base that on the fact that both Dalvik (http://davidehringer.com/software/android/The_Dalvik_Virtual...) and Lua 5.0 (http://www.lua.org/doc/jucs05.pdf) use register-based VMs.
"Berkeley DB is not exposed to the end-user. It is totally hidden below the SQLite APIs. It acts as the storage engine in place of SQLite's own BTREE. An application written to use the SQLite version 3 API can switch to Oracle Berkeley DB with no code changes, simply re-link against Berkeley DB."
http://www.oracle.com/technetwork/database/berkeleydb/overvi...
I haven't had time to read the full paper, but it looks very interesting.
I use SQLite frequently in small projects and it's absurdly invaluable. That it's free and public domain almost beggars belief.