Exploring the Virtual Database Engine inside SQLite

(coderweekly.com)

70 points · motter · 13y ago · 19 comments

rogerbinns · 13y ago

The big change hinted at in the documentation around the 3.5 era is that SQLite switched from a stack-based VM to a register-based one. The implementation details are not exposed to developers using SQLite, and there is a massive test suite, so the change had no visible impact.

What isn't mentioned is why SQLite is using a virtual machine in the first place. The reason is that SQLite only calculates the next row of results when you ask - it does not calculate all result rows at once in advance. Each time you ask for the next row of results it has to resume from where it last left off and calculate that next row. The virtual machine is a way of saving state between those calls for the next row (amongst other things).
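To make the "saving state between calls" idea concrete, here is a minimal sketch of a toy register machine whose `step()` pauses at each result row and resumes from the saved program counter on the next call. The class, the program, and the opcode semantics are all invented for illustration; the opcode names only loosely echo SQLite's, and this is not how SQLite's VM is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVM:
    """Toy register machine (hypothetical; not SQLite's real VM)."""
    program: list                 # list of (opcode, a, b) tuples
    table: list                   # list of row tuples to scan
    pc: int = 0                   # program counter, preserved between step() calls
    cursor: int = 0               # index of the current table row
    regs: dict = field(default_factory=dict)
    done: bool = False

    def step(self):
        """Run until the next result row is ready, then pause and return it.

        Returns None once the program has halted.
        """
        while not self.done:
            op, a, b = self.program[self.pc]
            self.pc += 1
            if op == "Rewind":        # start the scan; jump to a if the table is empty
                self.cursor = 0
                if not self.table:
                    self.pc = a
            elif op == "Column":      # regs[b] = column a of the current row
                self.regs[b] = self.table[self.cursor][a]
            elif op == "ResultRow":   # hand register a to the caller and pause
                return (self.regs[a],)
            elif op == "Next":        # advance the cursor; jump to a if rows remain
                self.cursor += 1
                if self.cursor < len(self.table):
                    self.pc = a
            elif op == "Halt":
                self.done = True
        return None


# A hand-written program roughly equivalent to "SELECT name FROM t".
PROGRAM = [
    ("Rewind", 4, 0),
    ("Column", 1, 0),
    ("ResultRow", 0, 0),
    ("Next", 1, 0),
    ("Halt", 0, 0),
]

vm = ToyVM(PROGRAM, [(1, "a"), (2, "b")])
first_row = vm.step()    # runs Rewind, Column, ResultRow, then pauses
second_row = vm.step()   # resumes at Next, loops once more, pauses again
end = vm.step()          # no rows left: runs Next, Halt, returns None
```

All the state needed to resume (program counter, cursor position, registers) lives in the VM object, so the caller can wait as long as it likes between calls.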

This is also why SQLite DB adapters have no method to get the number of result rows. SQLite has no way to know the count other than actually calculating every row, which takes the same effort as fetching all of them.
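A rough way to observe this from Python's standard `sqlite3` module is to register a counting function and see how often it has run after fetching a single row. The table `t` and the function name `trace` are made up for this sketch, and the exact number of calls after the first fetch depends on how the adapter prefetches, so the code only checks that it is far below the table size.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])

calls = 0


def trace(value):
    """Count how many rows SQLite has actually evaluated so far."""
    global calls
    calls += 1
    return value


# "trace" is a hypothetical name for this demo, registered as a 1-argument SQL function.
conn.create_function("trace", 1, trace)

cur = conn.execute("SELECT trace(x) FROM t")
first = cur.fetchone()
calls_after_first = calls      # only a handful of rows have been computed

# The only way to learn the row count is to consume the remaining rows.
remaining = sum(1 for _ in cur)
total = 1 + remaining
```

By the time `total` is known, `trace` has been invoked for every row, which is the "same amount of effort as getting all of them" described above.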

finnw · 13y ago

>Each time you ask for the next row of results it has to resume from where it last left off and calculate that next row. The virtual machine is a way of saving state between those calls for the next row (amongst other things).

They could have achieved the same effect by implementing it in a language with continuations or coroutines, or maybe just iterator functions. That might be viable today if the language is fast enough and callable from C.
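As a sketch of the iterator-function idea (in Python rather than C# or Lua, and with an invented `scan` helper), a generator keeps its position between calls without any explicit VM:

```python
def scan(table, predicate):
    """Yield matching rows one at a time; the generator frame holds the scan state."""
    for row in table:
        if predicate(row):
            yield row


rows = scan([(1, "a"), (2, "b"), (3, "c")], lambda r: r[0] != 2)
first = next(rows)    # runs the loop only until the first match
second = next(rows)   # resumes inside the loop, skips (2, "b")
rest = list(rows)     # nothing left after (3, "c")
```

Here the language runtime saves the loop position that a hand-built VM would otherwise keep in its program counter and cursor.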

C# and LuaJIT spring to mind.

beagle3 · 13y ago

SQLite predates both; the first official release is from August 2000.

Also, one of the goals of SQLite is to be available everywhere. It's available as super-standard-super-clean C code in one .c file, that works on just about every architecture. Had they used C# or LuaJIT, that could not be achieved.

I'm not sure if you are aware, but SQLite is what powers almost every app on Android or iPhone that needs tabular storage, including the built-in ones like Contacts.

jerf · 13y ago

"They could have achieved the same effect by implementing it in a language with continuations or coroutines"

They did. They just implemented their own minilanguage instead, that (AFAIK) simply has no conventional syntax-based serialization. It's still a language, though. Given their tight focus, it is completely plausible that they can create a little VM that will run far, far better than any general-purpose language VM could.

ominous_prime · 13y ago

But in the end, you're just trading performance for abstractions. There's not really a functional difference between writing the DB in a language running on a VM implemented in C and writing a DB with a VM implemented in C.

If performance and portability are your goals, C is going to be the logical choice.

onetwothreefour · 13y ago

SQLite is used everywhere. Embedded devices, all sorts of different architectures, etc.

C# and LuaJIT are not options. I don't really understand what your point is, because they implemented their own VM instead.

shortlived · 13y ago

Thanks for the insight. Do you know the reasons why they switched from using stack to registers in their VM implementation?

SQLite · 13y ago

We found it is much easier to handle exceptions without risking stack leaks using a 3-address register machine. Code generation is also simplified in a register machine compared to a stack machine (at least in the case of a VM designed specifically to run SQL statements).

The VM used by SQLite is different from the VM used by things like Javascript or Python in that SQL is not a general-purpose programming language. Most of the opcodes in the SQLite VM are heavy-weight concepts such as "insert a key/value pair into a b-tree" or "decode the N-th column of a table row". These opcodes are implemented with hundreds or thousands of lines of C code. The overhead of instruction dispatch is insignificant in comparison. And so there isn't much of a performance advantage one way or the other between a stack machine and a register machine in SQLite. The choice of a register machine is for maintainability, reliability, and testability.
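For readers who want to look at these opcodes, SQLite's `EXPLAIN` prefix returns the bytecode program for a statement as ordinary result rows. The sketch below uses Python's standard `sqlite3` module and a made-up table `t`; the listing is meant for debugging and its exact contents vary between SQLite versions, so treat any particular output as illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b TEXT)")

# Each EXPLAIN row describes one VM instruction: its address, the opcode
# name, and its operands (p1, p2, p3, ...).
listing = conn.execute("EXPLAIN SELECT b FROM t WHERE a > 5").fetchall()

for row in listing:
    addr, opcode, p1, p2, p3 = row[0], row[1], row[2], row[3], row[4]
    print(f"{addr:>4}  {opcode:<14} {p1:>4} {p2:>4} {p3:>4}")

opcodes = [row[1] for row in listing]
```

Even for this small query the program is only a handful of instructions, each doing a large unit of work such as opening a b-tree cursor or decoding a column.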

Someone · 13y ago

I do not know, but if I had to guess, it is for performance.

I am no expert in this field, but AFAIK http://static.usenix.org/events/vee05/full_papers/p153-yunhe... (discussed on http://news.ycombinator.com/item?id=704576) has not really been refuted.

I base that on the fact that both Dalvik (http://davidehringer.com/software/android/The_Dalvik_Virtual...) and Lua 5.0 (http://www.lua.org/doc/jucs05.pdf) use register-based VMs.

rabidsnail · 13y ago

Neat! Is there a way to tell the backend "run this bytecode, please"? If there is, SQLite would make a great test bed for trying out new query languages. Or, inversely, you could write a distributed database whose wire protocol was SQLite bytecode. Or you could write a code translator to let you run SQLite queries on Hadoop.

motter (OP) · 13y ago

Take a look at Oracle Berkeley DB:

"Berkeley DB is not exposed to the end-user. It is totally hidden below the SQLite APIs. It acts as the storage engine in place of SQLite's own BTREE. An application written to use the SQLite version 3 API can switch to Oracle Berkeley DB with no code changes, simply re-link against Berkeley DB."

http://www.oracle.com/technetwork/database/berkeleydb/overvi...

I haven't had time to read the full paper, but it looks very interesting.

lvh · 13y ago

SQLite is even recommended[1] (granted, by the authors) as a testbed for SQL extensions. While an entirely new query language would obviously be a lot more work, I don't see why it wouldn't be an equally good choice.

[1]: https://www.sqlite.org/whentouse.html

bane · 13y ago

I wonder if anybody has dared guesstimate how much money SQLite has provided to the world in terms of time saved by not hacking out buggy serialization and indexing code for software projects and just using this amazing tool instead.

I use SQLite frequently in small projects and it's absurdly invaluable. That it's free and PD almost beggars belief.

euroclydon · 13y ago

Informative but short article. I'd love to read more about when it's time to create a VM for your program or system. If anyone has links to more articles on VMs in practice, especially a story chronicling a transition from a non-VM architecture to a VM-based one, please post them.

themckman · 13y ago

Now that's just one of those things I would have never guessed. Super interesting.
