Modern languages might do more than C to prevent programmers from writing buggy code, but if you already have bug-free code due to massive time, attention, and testing, and the rate of change is low (or zero), it doesn’t really matter what the language is. SQLite could be assembly language for all it would matter.
This jibes with a point that the Google Security Blog made last year: "The [memory safety] problem is overwhelmingly with new code...Code matures and gets safer with time."
https://security.googleblog.com/2024/09/eliminating-memory-s...
https://www.sqlite.org/cves.html
Note that although code matures, the chance of human-error bugs in C will never go to zero. We have some bad incidents like Heartbleed to show this.
Instead what I see _mostly_ is re-writes and proposed re-writes of existing software, often software that has no networking functions, and/or relatively small, easily audited software that IMHO poses little risk of memory-related bugs
This is concerning to me as an end-user who builds their software from source because the effects on compilation, e.g., increased resource requirements, increased interdependencies, increased program size, decreased compilation speed, are significant
That is, nobody perceives, say, "the silver searcher" as being some sort of nefarious plot to re-write grep, but they did with ripgrep, even though that's not what it is trying to do.
There are a few projects that are deliberately doing a re-write for reasons they consider important, but they're not "just because it's in Rust," they're things like "openssl has had memory safety issues, we should use tools that eliminate those" or "sudo is very important and has grown a lot of features and has had memory safety issues, we should consider cutting some scope and using tools that help eliminate the memory safety issues." But the number of those projects is very small. Heck, even things like uutils were originally "I just want to learn Rust" projects and not some sort of "we must re-write the world in Rust."
Designing new software is orders of magnitude more difficult than iterating on existing software
Public domain. No "copyleft" license needed
Being written in a small, fast, "old and boring" language may be part of what makes SQLite appealing. Another (related) part may be the thoughtfulness and carefulness of its author, e.g., "time, attention and testing"
The former, i.e., the author's "time, attention and testing", may matter more than the latter, i.e., the author's language choice
As suggested by the top comment, in effect the author's language choice, by itself, may not matter with respect to the issue of "safety". If true, then even an "unsafe language" may not reduce the "safety" of SQLite^1
djb's software is also public domain and written in an "unsafe language", mostly the same "old and boring" one as used to write SQLite. Like SQLite it is appealing to many people and is found in many places^2
1. But the thoughtlessness and carelessness of an author, no matter what language they choose, is still relevant to "safety". In sum, "safety" is partly a function of human effort, e.g., "time, attention and testing", not simply the result of a language choice. Perhaps "safe" and "unsafe" are adjectives that can apply to authors as well as languages
This is obviously not analogous to Rust evangelism that targets projects written in C
The author claims the program is a clone of ack; ack is written in a "safe" language
Few things in this life are novel. Regardless, when I wrote ripgrep, I was focusing on writing new and better software. I perceived several problems with similar tools at the time and set out to do something better.
Imagine if people actually listened to whinging like your comment. I'm glad I never have.
We will see. On the Rust side there is Turso which is pretty active.
So SQLite is still the bar
I guess we could use Rust and I might be wrong on this, but it seemed like it would be a lot of work to utilize it compared to just continuing with C and gradually incorporating Zig, and we certainly don't write bug-free C.
I don’t get that. You had trouble optimizing Go, so you went with Python?
Memory bugs are often implicated in security issues, but other bugs are more likely to cause data loss, corruption, etc.
Don't get me wrong, despite not taking time to learn Rust at this time, I am aware that memory safety is the thing at this time. Yes... some software might do better being rewritten from C to Rust. However, there are other projects that have been worked on for years and years. SQLite is an example of this. That quote above is 100%
Various GNU tools are being replaced. Don't just expect them to be 100% despite being "memory safe" -- they will have to go through various tweaks to boost performance. With Rust likely (still) to go through some further changes, I am sure Linus will get frustrated at some points within the Linux Kernel. We shall see.
Point is - some things are just better left in their mature state. Though.. on the flip side, we have to think about the younger generation. Will SQLite survive if it continues to use C? It's likely to be a language that each new generation will not bother with, focusing instead on Rust, Zig.. or whatever comes out in the future.
I am just waiting for rewrite of Doom or Quake (I am sure they already exist if I can be bothered to search.. in Rust)
Not only was Apple able to launch the Mac Classic with zero lines of C code, their Pascal dialect lives on in Delphi and Free Pascal.
As one example.
The output is a non-portable half-a-million LoC Go file for each platform.
[Cries in Ada]
This is the C/++ delusion - "if one puts enough effort, a [complex] memory unsafe program can be made memory safe"; the year after this page was published, the Magellan series of RCEs was released.
Keeping SQLite in C is certainly a valid design choice, but it's important to be aware of the practical implications of the language.
We don't have to have one implementation of a lightweight SQL database. You can go out right now and start your own implementation in Rust or C++ or Go or Lisp or whatever you like! You can even make compatible APIs for it so that it can be a drop-in replacement for SQLite! No one can stop you! You don't need permission!
But why would we want to throw away the perfectly good C implementation, and why would we expect the C experts who have been carefully maintaining SQLite for a quarter century to be the ones to learn a new language and start over?
Because a lot of language advocacy has degraded to telling others what you want them to do instead of showing by example what to do. The idea behind this is that language adoption is some kind of zero sum game. If you're developing project 'x' in language 'y' then you are by definition not developing it in language 'z'. This reduces the stature of language 'z' and the continued existence of project 'x' in spite of not being written in language 'z' makes people wonder if language 'z' is actually as much of a necessity as its proponents claim. And never mind the fact that if the decision in what language 'x' would be written were to be revisited by the authors of 'x' that not only language 'z' would be on the menu, but also languages 'd', 'l', 'j' and 'g'.
I agree with what I think you're saying, which is that "sqlite" has, to some degree, become so ubiquitous that it's evolved beyond a single implementation.
We, of course, have sqlite the C library but there is also sqlite the database file format and there is no reason we can't have an sqlite implementation in golang (we already do) and one in pure rust too.
I imagine that in the future that will happen (pure rust implementation) and that perhaps at some point much further in the future, that may even become the dominant implementation.
There's also the Go-wrapped WASM build of the C sqlite[0] which is handy.
But think about all those karma points here and on Reddit, or GitHub stars!
The SQLite developers are actually open to the idea of rewriting SQLite in Rust, so they must see an advantage to it:
> All that said, it is possible that SQLite might one day be recoded in Rust. Recoding SQLite in Go is unlikely since Go hates assert(). But Rust is a possibility. Some preconditions that must occur before SQLite is recoded in Rust include: […] If you are a "rustacean" and feel that Rust already meets the preconditions listed above, and that SQLite should be recoded in Rust, then you are welcomed and encouraged to contact the SQLite developers privately and argue your case.
I think we like to fool ourselves that decisions like these are based on performance considerations or maintainability or whatever, but in reality they would be based on time to market and skill availability in the areas where the team is being built.
At the end of the day, SQLite is not being rewritten because the cost of doing so is not justifiable
Huh it's not everyday that I hear a genuinely new argument. Thanks for sharing.
This feels like chasing arbitrary 100% test coverage at the expense of safety. The code quality isn’t actually improved by omitting the checks even though it makes testing coverage go up.
I don't think I would (personally) ever be comfortable asserting that a code branch in the machine instructions emitted by a compiler can't ever be taken, no matter what, with 100% confidence, during a large fraction of situations in realistic application or library development, as to do so would require a type system powerful enough to express such an invariant, and in that case, surely the compiler would not emit the branch code in the first place.
One exception might be the presence of some external formal verification scheme which certifies that the branch code can't ever be executed, which is presumably what the article authors are gesturing towards in item D on their list of preconditions.
“What gets us into trouble is not what we don't know. It's what we know for sure that just ain't so.” — Mark Twain, https://www.goodreads.com/quotes/738123
If you can then come up with a scenario where you need it, well, in fully tested code you do need to test it.
// pseudocode
if (i >= array_length) panic("index out of bounds")
that are never actually run if the code is correct? But (if I understand correctly) these are checks implicitly added by the compiler. So the objection amounts to questioning the correctness of this auto-generated code, and is predicated upon mistrusting the correctness of the compiler? But presumably the Rust compiler itself would have thorough tests that these kinds of checks work?
Someone please correct me if I'm misunderstanding the argument.
Automatic array bounds checks can get hit by corrupted data. Thereby leading to a crash of exactly the kind that SQLite tries to avoid. With complete branch testing, they can guarantee that the test suite includes every kind of corruption that might hit an array bounds check, and guarantee that none of them panic. But if the compiler is inserting branches that are supposed to be inaccessible, you can't do complete branch testing. So now how do you know that you have tested every code branch that might be reached from corrupted data?
Furthermore those unused branches are there as footguns which are reachable with a cosmic ray bit flip, or a dodgy CPU. Which again undermines the principle of keeping running if at all possible.
This is a dubious statement. In Rust, the array indexing operator arr[i] is syntactic sugar for calling the function arr.index(i), and the implementation of this function on the standard library's array types is documented to perform a bounds-check assertion and access the element.
So the checks aren't really implicitly added -- you explicitly called a function that performs a bounds check. If you want different behavior, you can call a different, slightly-less-ergonomic indexing function, such as `get` (which returns an Option, making your code responsible for handling the failure case) or `get_unchecked` (which requires an unsafe block and exhibits UB if the index is out of bounds, like C).
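To make the three access paths concrete, here is a minimal sketch. The helper names (`read_indexed`, `read_checked`, `read_unchecked`) are made up for illustration; only the indexing operator, `get`, and `get_unchecked` are standard-library API.

```rust
/// Plain indexing: panics if `i` is out of bounds.
pub fn read_indexed(values: &[u32], i: usize) -> u32 {
    values[i]
}

/// `get` hands the failure case back to the caller as an `Option`.
pub fn read_checked(values: &[u32], i: usize) -> Option<u32> {
    values.get(i).copied()
}

/// `get_unchecked` skips the check entirely.
///
/// # Safety
/// The caller must guarantee `i < values.len()`; otherwise this is
/// undefined behavior, as with a raw array access in C.
pub unsafe fn read_unchecked(values: &[u32], i: usize) -> u32 {
    // SAFETY: the bound is upheld by the caller, see above.
    unsafe { *values.get_unchecked(i) }
}
```

Which of the three is appropriate depends on who is expected to own the out-of-bounds case: the runtime (panic), the caller (`Option`), or nobody (UB).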
I wouldn't put it that way. Usually when we say the compiler is "incorrect", we mean that it's generating code that breaks the observable behavior of some program. In that sense, adding extra checks that can't actually fail isn't a correctness issue; it's just an efficiency issue. I'd usually say the compiler is being "conservative" or "defensive". However, the "100% branch testing" strategy that we're talking about makes this more complicated, because this branch-that's-never-taken actually is observable, not to the program itself but to its test suite.
sure safety checks are added but
it's ignoring that many of such checks get reliably optimized away
worse it's a bit like saying "in case of a broken invariant I prefer arbitrary potential highly problematic behavior over clean aborts (or errors) because my test tooling is inadequate"
instead of saying "we haven't found adequate test tooling" for our use case
Why inadequate? Because technically test setups can use
1. fault injection to test such branches even if normally you would never hit them
2. for many of such tests (especially array bound checks) you can pretty reliably identify them and then remove them from your test coverage statistic
idk. what the tooling of rust wrt this is in 2025, but around the rust 1.0 times you mainly had C tooling you applied to rust so you had problems like that back then.
#ifdef CONTRACTS
if (i >= array_length) panic("index out of bounds")
#endif
Rust does not stop you from writing code that accesses out of bounds, at all. It just makes sure that there's an if that checks.
Certainly don't get me wrong, SQLite is one of the best and most thoroughly tested libraries out there. But this was an argument to have 4 arguments. That's because 2 of the arguments break down as "Those languages didn't exist when we first wrote SQLite and we aren't going to rewrite the whole library just because a new language came around."
Any language, including C, will emit or not emit instructions that are "invisible" to the author. For example, whenever the C compiler decides it can autovectorize a section of a function it'll be introducing a complicated set of SIMD instructions and new invisible branch tests. That can also happen if the C compiler decides to unroll a loop for whatever reason.
The entire point of compilers and their optimizations is to emit instructions which keep the semantic intent of higher level code. That includes excluding branches, adding new branches, or creating complex lookup tables if the compiler believes it'll make things faster.
Dr Hipp is completely correct in rejecting Rust for SQLite. Sqlite is already written and extremely well tested. Switching over to a new language now would almost certainly introduce new bugs that don't currently exist as it'd inevitably need to be changed to remain "safe".
0: https://doc.rust-lang.org/std/vec/struct.Vec.html#method.get...
There already is an implicit "branch" on every array access in C, it's called an access violation.
Do they test for a segfault on every single array access in the code base? No? Then they don't really have 100% branch coverage, do they?
I think a lot of projects that claim to have 100% coverage are overselling their testing, but SQLite is in another category of thoroughness entirely.
A simple array access in C:
arr[i] = 123;
...can be thought of as being equivalent to:
if (i >= array_length) UB();
else arr[i] = 123;
where the "UB" function can do literally anything. From the perspective of exhaustively testing and formally verifying software, I'd rather have the safe-language equivalent:
if (i >= array_length) panic();
else arr[i] = 123;
...because at least I can reason about what happens if the supposedly-unreachable condition occurs.
Dr. Hipp mentions that "Recoding SQLite in Go is unlikely since Go hates assert()", implying that SQLite makes use of assert statements to guard against unreachable conditions. Surely his testing infrastructure must have some way of exempting unreachable assert branches -- so why can't bounds checks (that do nothing but assert undefined behavior does not occur) be treated in the same way?
A more complex C program can have index range checking at a different place than the simple array access. The compiler's flow analysis isn't always able to confirm that the index is guaranteed to be checked. If it therefore adds a cautionary (and unneeded) range check, then this code branch can never be exercised, making the code no longer 100% branch tested.
you basically say if deeply unexpected things happen you prefer your program doing wildly arbitrary and as such potentially dangerous things over it having a clean abort or proper error. ... that doesn't seem right
worse it's due to a lack in the tooling used and not a fundamental problem, not only can you test these branches (using fault injection) you also often (not always) can separate them from relevant branches when collecting the branch statistics
so the whole argument misses the point (which is tooling is lacking, not extra checks for array bounds and similar)
lastly array bounds checking is probably the worst example they could have given as it
- often can be disabled/omitted in optimized builds
- is quite often optimized away
- has often quite low perf. overhead
- bound check branches are often very easy to identify, i.e. excluding them from a 100% branch testing statistic is viable
- out of bounds read/write are some of the most common cases of memory unsafety leading to security vulnerability (including full RCE cases)
SQLite isn't a program, it's a library used by many other programs. As such, aborting is not an option. It doesn't do "wildly arbitrary" things - it reports errors to the client application and takes it on faith that they will respond appropriately.
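For what "report an error instead of aborting" could look like in a safe language, here is a rough sketch in Rust. Everything in it is hypothetical: `DbError`, `read_record`, and the one-byte length-prefixed record layout are invented for illustration and have nothing to do with SQLite's actual file format or API.

```rust
/// Hypothetical error codes, loosely modelled on a C library's return codes.
#[derive(Debug, PartialEq, Eq)]
pub enum DbError {
    Corrupt,
}

/// Reads a record laid out as [len: u8][payload: len bytes].
/// A length byte that points past the end of the buffer is reported
/// as `DbError::Corrupt` rather than triggering a panic.
pub fn read_record(page: &[u8]) -> Result<&[u8], DbError> {
    // An empty page has no length byte at all.
    let (&len, rest) = page.split_first().ok_or(DbError::Corrupt)?;
    // `get` with a range returns None instead of panicking when the
    // claimed length exceeds the remaining bytes.
    rest.get(..len as usize).ok_or(DbError::Corrupt)
}
```

Here the corrupt-input path is an ordinary branch in the source, so a test suite can reach it by feeding in a bad length byte.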
It's like seat belts.
E.g. what if we drive four blocks and then the case occurs where the seatbelt is needed? Okay, we have an explicit test for that.
But we cannot test everything. We have not tested what happens if we drive four blocks, and then take a right turn, and hit something half a block later.
Screw it, just remove the seatbelts and not have this insane untested space whereby we are never sure whether the seat belt will work properly and prevent injury!
- Rust needs to mature a little more, stop changing so fast, and move further toward being old and boring.
- Rust needs to demonstrate that it can be used to create general-purpose libraries that are callable from all other programming languages.
- Rust needs to demonstrate that it can produce object code that works on obscure embedded devices, including devices that lack an operating system.
- Rust needs to pick up the necessary tooling that enables one to do 100% branch coverage testing of the compiled binaries.
- Rust needs a mechanism to recover gracefully from OOM errors.
- Rust needs to demonstrate that it can do the kinds of work that C does in SQLite without a significant speed penalty.
2. This has been demonstrated.
3. This one hinges on your definition of “obscure,” but the “without an operating system” bit is unambiguously demonstrated.
4. I am not an expert here, but given that you’re testing binaries, I’m not sure what is Rust specific. I know the Ferrocene folks have done some of this work, but I don’t know the current state of things.
5. Rust as a language does no allocation. This OOM behavior belongs to the standard library, which you’re not using in these embedded cases anyway. There, you’re free to do whatever you’d like, as it’s all just library code.
6. This also hinges on a lot of definitions, so it could be argued either way.
ironically if we look at how things play out in practice rust is far more suited as a general purpose language than C, to a point where I would argue C is only a general purpose language on a technicality, not on a practical IRL basis
this is especially ridiculous when they argue C is the fastest general purpose language when that has proven to simply not hold up in larger IRL projects (i.e. not micro benchmarks)
C has terrible UX for generic code re-use and memory management, which often means that in IRL projects people don't write the fastest code. Wrt. memory management it's not rare to see unnecessary clones, as skipping them too easily leads to bugs. Wrt. data structures you write the code which is maintainable, robust and fast enough, and sometimes add the 10th maximally simple reimplementation (or C macro or similar) of some data structure instead of reusing data structures people spent years fine-tuning.
When people switched a lot from C to C++ most general purpose projects got faster, not slower. And even for the C++ to Rust case it's not rare that companies end up with faster projects after the switch.
Both C++ and Rust also allow more optimization in general.
So C is only fastest in micro benchmarks after excluding stuff like fortran for not being general purpose while itself not really being used much anymore for general purpose projects...
Rust insists on its own toolchain installer "rustup" and frowns on distro maintainers. When Rust is happy to just be packaged by the distro and rustup has gone away, then it will have matured to at least adolescence.
The current version of the Rust compiler definitely doesn't -- there's known issues like https://github.com/rust-lang/rust/issues/57893 -- but maybe there's some historical version from before the features that caused those problems were introduced.
Of course, two libraries that choose different no_std collection types can't communicate...but hey, we're comparing to C here.
I’d love to see rust be so stable that MSRV is an anachronism. I want it to be unthinkable that you’d have any reason not to support Rust from forever ago, because the feature set is so stable.
Like why defend C in 2025 when you only have to defend C in 2000 and then argue you have an old, stable, deeply tested, C code base which has no problem with anything like "commonly having memory safety issues" and is maintained by a small group of people very highly skilled in C.
Like that argument alone is all you need, a win, simple straight forward, hard to contest.
But most of the other arguments they list can be picked apart and are only half true.
I'd like to see you pick the other arguments apart
Not OP, and I'm not really arguing with the post, but this struck me as a really odd thing to include in the article. Of course nothing is going to be faster than C, because it compiles straight to machine code with no garbage collection. Literally any language that does the same will be the same speed but not faster, because there's no way to be faster. It's physically impossible.
A much better statement, and one in line with the rest of the article, would be that at the time C and C++ were really the only viable languages that gave them the performance they wanted, and C++ wouldn't have given them the interoperability they wanted. So their only choice was C.
> Safe languages usually want to abort if they encounter an out-of-memory (OOM) situation. SQLite is designed to recover gracefully from an OOM. It is unclear how this could be accomplished in the current crop of safe languages.
I don't think most Rust code written today has guardrails in the case of OOM. I don't think this disqualifies Rust for most things, because I happen to find the trade-off worth it compared to the things it does protect against that C doesn't, but I don't think it's a particularly controversial take that Rust still could use some ergonomic improvements around handling allocation failures. Right now, trying to create a Box or Vec can theoretically fail at runtime if no memory is available, and those failures aren't returned from the functions called to create them. Handling panics is something you can normally do in Rust, but if you're already OOM, things get complicated pretty fast.
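For reference, the standard library does have a fallible path for `Vec`, even if most code doesn't use it. Below is a small sketch using `Vec::try_reserve` and `Vec::try_reserve_exact`; the function names `copy_to_buffer` and `try_alloc` are hypothetical. As far as I know there is no equivalent stable fallible constructor for `Box`, which is part of the ergonomics gap described above.

```rust
use std::collections::TryReserveError;

/// Copies `data` into a fresh buffer, surfacing allocation failure
/// as an error value instead of aborting the process.
pub fn copy_to_buffer(data: &[u8]) -> Result<Vec<u8>, TryReserveError> {
    let mut buf = Vec::new();
    buf.try_reserve_exact(data.len())?;
    // Capacity is already reserved, so this does not allocate again.
    buf.extend_from_slice(data);
    Ok(buf)
}

/// Tries to reserve room for `n` bytes without writing to them.
pub fn try_alloc(n: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut buf = Vec::new();
    buf.try_reserve(n)?;
    Ok(buf)
}
```

Note that on systems with memory overcommit, a successful reservation does not guarantee the pages can actually be backed later, so this only covers the failures the allocator itself reports.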
I agree that in the long run it would probably make sense to have something like this in Rust when eventually the current maintainers aren't around, but I also don't think it makes much sense to criticize them for continuing to maintain the code that already exists.
"Why is SQLite coded in C and not Rust?" is a question, which immediately makes me want to ask "Why do you need SQLite coded in Rust?".
they have a blog hinting at some answers as to "why": https://turso.tech/blog/introducing-limbo-a-complete-rewrite...
SQLite is old, huge and known for its gigantic test coverage. There’s just so much to rewrite.
DuckDB is from 2019, so new enough to jump on the “rust is safe and fast”
Maybe autovectorization works, but can I just write a few ARM64 instructions on my Mac in Rust stable (not experimental/nightly) as I can do it in C/C++ by just including a few ARM specific header files?
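As far as I know the answer is yes: the NEON intrinsics in `std::arch::aarch64` are available on stable Rust, with names mirroring the C ones from `arm_neon.h`. A minimal sketch follows; `add4` is a made-up helper, and the scalar fallback is only there so the snippet also builds on non-ARM targets.

```rust
/// Adds two 4-lane float vectors using NEON on AArch64.
#[cfg(target_arch = "aarch64")]
pub fn add4(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    use std::arch::aarch64::{vaddq_f32, vld1q_f32, vst1q_f32};
    let mut out = [0.0f32; 4];
    // SAFETY: every pointer refers to four valid f32 values, and NEON
    // is part of the baseline for the standard AArch64 targets.
    unsafe {
        let sum = vaddq_f32(vld1q_f32(a.as_ptr()), vld1q_f32(b.as_ptr()));
        vst1q_f32(out.as_mut_ptr(), sum);
    }
    out
}

/// Scalar fallback so the example compiles on other architectures.
#[cfg(not(target_arch = "aarch64"))]
pub fn add4(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    [a[0] + b[0], a[1] + b[1], a[2] + b[2], a[3] + b[3]]
}
```

Inline assembly via `core::arch::asm!` is another stable option when an intrinsic for the instruction you want doesn't exist.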
https://news.ycombinator.com/item?id=28278859 - August 2021
https://news.ycombinator.com/item?id=16585120 - March 2018
The current doc no longer has any paragraphs about security, or even the word security once.
The 2021 edition of the doc contained this text which no longer appears: 'Safe languages are often touted for helping to prevent security vulnerabilities. True enough, but SQLite is not a particularly security-sensitive library. If an application is running untrusted and unverified SQL, then it already has much bigger security issues (SQL injection) that no "safe" language will fix.
It is true that applications sometimes import complete binary SQLite database files from untrusted sources, and such imports could present a possible attack vector. However, those code paths in SQLite are limited and are extremely well tested. And pre-validation routines are available to applications that want to read untrusted databases that can help detect possible attacks prior to use.'
https://web.archive.org/web/20210825025834/https%3A//www.sql...
The biggest gripe I have with a rewrite is... A lot of the time we rewrite for feature parity. Not the exact same thing. So you are kind of ignoring/missing/forgetting all those edge cases and patches that were added along the way for so many niche or otherwise reasons.
This means broken software. Something which used to work before but not anymore. They'll have to encounter all of them again in the wild and fix it again.
Obviously if we are to rewrite an important piece of software like this, you'd emphasise more on all of these. But it's hard for me to comprehend whether it will be 100%.
But other than sqlite, think SDL. If it is to be rewritten. It's really hard for me to comprehend that it's negligible in effect. Am guessing horrible releases before it gets better. Users complaining about things that used to work.
C is going to be there long after the next Rust is where my money is. And even if Rust is still present, there would be a new Rust then.
So why rewrite? Rewrites shouldn't be the default thinking no?
I am not Dr Hipp, and therefore I like run-time checks.
Also, does it use doubly linked lists or graphs at all? Those can, in a way, be safer in C since Rust makes you roll your own virtual pointer arena.
You can implement a linked list in Rust the same as you would in C using raw pointers and some unsafe code. In fact there is one in the standard library.
You can write a linked list the same way you would in C if you wish.
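A rough sketch of what that looks like: a doubly linked list with raw `prev`/`next` pointers, much as you'd write it in C with `malloc`/`free`. `RawList` and `Node` are hypothetical names for this example; the standard-library type referred to above is `std::collections::LinkedList`.

```rust
use std::ptr;

struct Node {
    value: i32,
    prev: *mut Node,
    next: *mut Node,
}

/// Doubly linked list that owns its nodes through raw pointers.
pub struct RawList {
    head: *mut Node,
    tail: *mut Node,
}

impl RawList {
    pub fn new() -> Self {
        RawList { head: ptr::null_mut(), tail: ptr::null_mut() }
    }

    pub fn push_back(&mut self, value: i32) {
        // Heap-allocate the node and take over ownership manually.
        let node = Box::into_raw(Box::new(Node {
            value,
            prev: self.tail,
            next: ptr::null_mut(),
        }));
        if self.tail.is_null() {
            self.head = node;
        } else {
            // SAFETY: a non-null tail always points at a live node we own.
            unsafe { (*self.tail).next = node };
        }
        self.tail = node;
    }

    pub fn pop_front(&mut self) -> Option<i32> {
        if self.head.is_null() {
            return None;
        }
        // SAFETY: head came from Box::into_raw and is freed exactly once here.
        let node = unsafe { Box::from_raw(self.head) };
        self.head = node.next;
        if self.head.is_null() {
            self.tail = ptr::null_mut();
        } else {
            // SAFETY: the new head is a live node owned by this list.
            unsafe { (*self.head).prev = ptr::null_mut() };
        }
        Some(node.value)
    }

    pub fn pop_back(&mut self) -> Option<i32> {
        if self.tail.is_null() {
            return None;
        }
        // SAFETY: tail came from Box::into_raw and is freed exactly once here.
        let node = unsafe { Box::from_raw(self.tail) };
        self.tail = node.prev;
        if self.tail.is_null() {
            self.head = ptr::null_mut();
        } else {
            // SAFETY: the new tail is a live node owned by this list.
            unsafe { (*self.tail).next = ptr::null_mut() };
        }
        Some(node.value)
    }
}

impl Drop for RawList {
    fn drop(&mut self) {
        // Free any remaining nodes.
        while self.pop_front().is_some() {}
    }
}
```

The unsafe parts are confined to the list's own methods; callers only see a safe `push_back`/`pop_front`/`pop_back` interface.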
sure, it's an old library they had pretty much anything (not because they don't know what they are doing but because shit happens)
lets check CVEs of the last few years:
- CVE-2025-29088 type confusion
- CVE-2025-29087 out of bounds write
- CVE-2025-7458 integer overflow, possible in optimized rust but test builds check for it
- CVE-2025-6965 memory corruption, rust might not have helped
- CVE-2025-3277 integer overflow, rust might have helped
- CVE-2024-0232 use after free
- CVE-2023-36191 segmentation violation, unclear if rust would have helped
- CVE-2023-7104 buffer overflow
- CVE-2022-46908 validation logic error
- CVE-2022-35737 array bounds overflow
- CVE-2021-45346 memory leak
...
as you can see the majority of CVEs of sqlite are much less likely in rust (but a rust sqlite impl. likely would use unsafe, so not impossible)
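On the integer overflow entries in that list: with default settings, arithmetic overflow in Rust panics in debug builds and wraps silently in release builds, so code that must reject overflow in every profile has to ask for it explicitly. A minimal sketch with hypothetical helper names:

```rust
/// Computes `count * elem_size`, returning `None` on overflow in
/// every build profile.
pub fn checked_alloc_size(count: usize, elem_size: usize) -> Option<usize> {
    count.checked_mul(elem_size)
}

/// Explicitly wrapping variant: this is the silent result a plain `*`
/// would produce in a build without overflow checks.
pub fn wrapping_alloc_size(count: usize, elem_size: usize) -> usize {
    count.wrapping_mul(elem_size)
}
```

The wrapped value is still a valid `usize`, which is why an unchecked size computation can turn into an undersized buffer rather than an immediate failure.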
as a side note there being so many CVEs in 2025 seems to be related to some companies (e.g. Google) having done quite a bit of fuzz testing of SQLite
other takeaways:
- 100% branch coverage is nice, but doesn't guarantee memory soundness in C
- given how deeply people look for CVEs in SQLite the number of CVEs found is not at all as bad as it might look
but also one final question:
SQLite uses some of the best C programmers out there, only they merge anything to the code, it had very limited degree of change compared to a typical company project. And we still have memory vulnerabilities. How is anyone still arguing for C for new projects?
Yeah I essentially agree. I'm sure there are still plenty of good cases for C, depending on project size, experience of the engineers, integration with existing libraries, target platform, etc. But it definitely seems like Rust would be the better option in scenarios where there's not some a priori thing that strongly skews toward or forces C.
It has async I/O support on Linux with io_uring, vector support, BEGIN CONCURRENT for improved write throughput using multi-version concurrency control (MVCC), Encryption at rest, incremental computation using DBSP for incremental view maintenance and query subscriptions.
Time will tell, but this may well be the future of SQLite.
Also, this is a VC backed project. Everyone has to eat, but I suspect that Turso will not go out of its way to offer a Public Domain offering or 50 year support in the way that SQLite has.
The aim is to be compatible with sqlite, and a drop-in replacement for it, so I think it's fair use.
> Also, this is a VC backed project. Everyone has to eat, but I suspect that Turso will not go out of its way to offer a Public Domain offering or 50 year support in the way that SQLite has.
It's MIT license open-source. And unlike sqlite, encourages outside contribution. For this reason, I think it can "win".
SQLite is NOT being rewritten in Rust!
>>Turso Database is an in-process SQL database written in Rust, compatible with SQLite.
turdso is VC funded so will probably be defunct in 2 years
Compatible with SQLite. So it's another database?
Occasionally when working in Lua I'd write something low-level in C++, wrap it in C, and then call the C wrapper from Lua. It's extra boilerplate but damn is it nice to have a REPL for your C++ code.
Edit: Because someone else will say it - Rust binary artifacts _are_ kinda big by default. You can compile libstd from scratch on nightly (it's a couple flags) or you can amortize the cost by packing more functions into the same binary, but it is gonna have more fixed overhead than C or C++.
If I want a "C Library", I want a "C Library" and not some weird abomination that has been surgically grafted to libstdc++ or similar (but be careful of which version as they're not compatible and the name mangling changes and ...).
This isn't theoretical. It's such a pain that the C++ folks started resorting to header-only libraries just to sidestep the nightmare.
So you might think, but there is a committee actively undermining this, not to mention compiler people keeping things exciting also.
There is a dogged adherence to backward compatibility, so that you can pretend C has not gone anywhere in thirty-five years, if you like --- provided you aren't invoking too much undefined behavior. (You can't as easily pretend that your compiler has not gone anywhere in 35 years with regard to things you are doing out of spec.)
So, the argument for keeping SQLite written in C is that it gives the user the choice to either:
- Build SQLite with Yolo-C, in which case you get excellent performance and lots of tooling. And it's boring in the way that SQLite devs like. But it's not "safe" in the sense of memory safe languages.
- Build SQLite with Fil-C, in which case you get worse (but still quite good) performance and memory safety that exceeds what you'd get with a Rust/Go/Java/whatever rewrite.
Recompiling with Fil-C is safer than a rewrite into other memory safe languages because Fil-C is safe through all dependencies, including the syscall layer. Like, making a syscall in Rust means writing some unsafe code where you could screw up buffer sizes or whatnot, while making a syscall in Fil-C means going through the Fil-C runtime.
SQLite was recoded (automatically) in Go a while ago [1], and it is widely deployed.
> would probably introduce far more bugs than would be fixed
It runs against the same test suite with no issues
> and it may also result in slower code
It is quite a lot slower, but it is still widely used as it turns out that the convenience of a native port outweighs the performance penalty in most cases.
I don't think SQLite should be rewritten in Go, Rust, Zig, Nim, Swift ... but ANSI C is a subset of the feature set of most modern programming languages. Projects such as this could be written and maintained in C indefinitely, and be automatically translated to other languages for the convenience of users in those languages
It runs against the same public test suite. The proprietary test suite is much more intensive.
> It runs against the same test suite with no issues
That proves nothing about bugs existing or not.
That doesn't guarantee no bugs. It just means that the existing behaviour covered by the tests is still the same. It may introduce new issues in untested edge cases or performance issues
"Libraries written in C do not have a huge run-time dependency.
In its minimum configuration, SQLite requires only the following routines from the standard C library:
memcmp() memcpy() memmove() memset() strcmp() strlen() strncmp()
In a more complete build, SQLite also uses library routines like malloc() and free() and operating system interfaces for opening, reading, writing, and closing files. But even then, the number of dependencies is very small. Other "modern" languages, in contrast, often require multi-megabyte runtimes loaded with thousands and thousands of interfaces."
Very laudable!
(I should also point out that SQLite could conceivably be compiled with small (in terms of lines of code) C compilers like Fabrice Bellard's Tiny C Compiler (TCC). Also SQLite's few required standard C library routines listed above could conceivably be coded inside of SQLite itself(!) (they are, after all, just additional lines of C code in a different place -- and those could conceivably be moved or copied) -- thus removing the dependency/requirement for any standard C library whatsoever!)
Anyway, we love SQLite!
Safe languages insert additional machine branches to do things like verify that array accesses are in-bounds. In correct code, those branches are never taken. That means that the machine code cannot be 100% branch tested, which is an important component of SQLite's quality strategy.
Rust needs to mature a little more, stop changing so fast, and move further toward being old and boring.
Rust needs to demonstrate that it can do the kinds of work that C does in SQLite without a significant speed penalty.
I suppose SQLite might use a C linter tool that can prove the bounds checks happen at a higher layer, and then elide redundant ones in lower layers, but... C compilers won't do that by default, they'll just write memory-unsafe machine code. Right?
This is annoying in Rust. To me array accesses aren't the most annoying, it's match{} branches that will never be invoked.
There is unreachable!() for such situations, and you would hope that:
if array_access_out_of_bounds { unreachable!(); }
is recognised by the Rust tooling and just ignored. That's effectively the same as what SQLite is doing now by not doing the check. But it isn't ignored by the tooling: unreachable!() is reported as a missed line. Then there is the test code coverage including the standard output by default, and you have to use regexes on path names to remove it.
Your example does what [] does already, it’s just a more verbose way of writing the same thing. It’s not the same behavior as sqlite.
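To make the disagreement above concrete, here is a minimal Rust sketch comparing four ways of reading an array element. The function names are made up for illustration and nothing here comes from SQLite or Turso. It only shows that the hand-written check with unreachable!() panics on the same inputs as plain [], while an unchecked access is the closest analogue to the C behavior of not checking at all.

```rust
/// Plain indexing: the compiler inserts a bounds check that panics when
/// `i >= data.len()`.
pub fn get_indexed(data: &[u32], i: usize) -> u32 {
    data[i]
}

/// The hand-written check from the comment above. It panics on exactly the
/// same inputs as `get_indexed`, so the extra branch is still in the binary.
pub fn get_checked_by_hand(data: &[u32], i: usize) -> u32 {
    if i >= data.len() {
        unreachable!("index {} out of bounds", i);
    }
    data[i]
}

/// `get` exposes the same branch as an `Option`, so the out-of-bounds arm
/// can be exercised by an ordinary test without panicking.
pub fn get_or_default(data: &[u32], i: usize) -> u32 {
    data.get(i).copied().unwrap_or(0)
}

/// No check at all, which is what C does. The caller must guarantee
/// `i < data.len()`; otherwise this is undefined behavior.
pub unsafe fn get_unchecked(data: &[u32], i: usize) -> u32 {
    // SAFETY: upheld by the caller, as documented above.
    unsafe { *data.get_unchecked(i) }
}
```

Only the last variant removes the branch, and it does so by giving up the safety guarantee. Whether the optimizer can drop the check in the first three depends on what it can prove at each call site.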
https://algora.io/challenges/turso "Turso is rewriting SQLite in Rust ; Find a bug to win $1,000"
------
- Dec 10, 2024 : "Introducing Limbo: A complete rewrite of SQLite in Rust"
https://turso.tech/blog/introducing-limbo-a-complete-rewrite...
- Jan 21, 2025 - "We will rewrite SQLite. And we are going all-in"
https://turso.tech/blog/we-will-rewrite-sqlite-and-we-are-go...
- Project: https://github.com/tursodatabase/turso
Status: "Turso Database is currently under heavy development and is not ready for production use."
turso has 341 rust source files spread across tens of directories and 514 (!) external dependencies that produce (in release mode) 16 libraries and 7 binaries with tursodb at 48M and libturso_sqlite3.so at 36M.
looks roughly an order of magnitude larger to me. it would be interesting to understand the memory usage characteristics in real-world workloads. these numbers also sort of capture the character of the languages. for extreme portability and memory efficiency, probably hard to beat c and autotools though.
Talking about C99, or C++11, and then “oh you need the nightly build of rust” were juxtaposed in such a way that I never felt comfortable banging out “yum install rust” and giving it a go.
(There are some decent reasons to use the nightly toolchain in development even if you don’t rely on any unfinished features in your codebase, but in that case the code builds on stable just fine if you prefer.)
"....Safe languages insert additional machine branches to do things like verify that array accesses are in-bounds. In correct code, those branches are never taken. That means that the machine code cannot be 100% branch tested, which is an important component of SQLite's quality strategy..."
"...Safe languages usually want to abort if they encounter an out-of-memory (OOM) situation. SQLite is designed to recover gracefully from an OOM. It is unclear how this could be accomplished in the current crop of safe languages..."
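For what it's worth, Rust's standard library does offer a few explicitly fallible allocation calls. The sketch below uses Vec::try_reserve and Vec::try_reserve_exact to hand an allocation failure back to the caller. The function names and the fallback policy are hypothetical. It is also only a partial answer to the quoted concern: most standard-library APIs (push, Box::new, String growth) still abort on OOM, so every allocation path would need this treatment.

```rust
use std::collections::TryReserveError;

/// Append `extra` to `buf`, returning an error to the caller instead of
/// aborting the process if memory cannot be reserved.
pub fn append_fallible(buf: &mut Vec<u8>, extra: &[u8]) -> Result<(), TryReserveError> {
    // Fails with `Err` if the allocator refuses the request or if the
    // requested capacity overflows.
    buf.try_reserve(extra.len())?;
    // Capacity is already reserved, so this does not reallocate.
    buf.extend_from_slice(extra);
    Ok(())
}

/// Hypothetical policy: try to allocate `wanted` bytes of capacity and,
/// if that fails, recover by retrying with a smaller `fallback` size.
pub fn alloc_with_fallback(wanted: usize, fallback: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut buf: Vec<u8> = Vec::new();
    if buf.try_reserve_exact(wanted).is_err() {
        // First attempt failed; degrade gracefully rather than abort.
        buf.try_reserve_exact(fallback)?;
    }
    Ok(buf)
}
```

A caller would map the TryReserveError to its own out-of-memory error code and unwind its state, which is roughly the shape of recovery the quoted passage describes.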
One stupid workaround is combining multiple columns into one, with values separated by a space, for example. This works when each column value is always a string containing no spaces
Another stupid workaround, probably slower, might be to hash the multiple columns into a new column and use ON CONFLICT(newcolumn_name)
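A small Rust sketch of both workarounds, to show why the first one needs the "no spaces" condition. The function names are hypothetical, and DefaultHasher is used only for illustration: its output is not guaranteed to be stable across Rust releases, so a real stored hash column would need a fixed, documented hash function.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Workaround 1: join the column values with a space to form one key.
/// Only unambiguous when no value contains a space.
pub fn joined_key(cols: &[&str]) -> String {
    cols.join(" ")
}

/// Workaround 2: hash the whole tuple of columns into a single value.
/// Each element's boundary is part of the hashed data, so embedded spaces
/// do not matter, but distinct tuples can still collide by chance.
pub fn hashed_key(cols: &[&str]) -> u64 {
    let mut hasher = DefaultHasher::new();
    cols.hash(&mut hasher);
    hasher.finish()
}
```

With the joined key, the rows ("a b", "c") and ("a", "b c") both become "a b c" and would be treated as conflicting. The hashed key keeps them apart, at the cost of an extra column and a small chance of false conflicts from hash collisions.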
At this point I wish the creators of the language could talk about what rust is bad at.
Not only had this fellow built a functional ISP in one of the toughest markets (at that time), in the world - but he'd also managed to build the database engine and quite a few of the other tools that ran that ISP, and was in danger of setting a few standards for a few things which, since then, have long since settled out, but .. nevertheless .. it could've been.
Anyway, this fellow wrote everything in C. His web page, his TODO.h for the day .. he had C-based tools for managing his docs, for doing syncs between various systems under his command (often in very far-away locations, and even under water a couple times) .. everything, in C.
The database system he wrote in pure C was, at the time, quite a delight. It gave a few folks further up the road a bit of a tight neck.
He went on to do an OS, because of course he did.
Just sayin', SQLite devs aren't the only ones who got this right. ;)
Zig gives the programmer more control than Rust. I think this is one of the reasons why TigerBeetle is written in Zig.
More control over what exactly? Allocations? There is nothing Zig can do that Rust can’t.
I mean yeah, allocations. Allocations are always explicit. Which is not true in C++ or Rust.
Personally I don't think it's that big of a deal, but it's a thing and maybe some people care enough.
From section "1.2 Compatibility". How easy is it to embed a library written in Zig in, say, a small embedded system where you may not be using Zig for the rest of the work?
Also, since you're the submitter, why did you change the title? It's just "Why is SQLite Coded in C", you added the "and not Rust" part.
Any idea what this refers to? assert is a macro in C. Is the implication that OP wants the capability of testing conditions and then turning off the tests in a production release? If so, then I think the argument is more that Go hates the idea of a preprocessor. Or have I misunderstood the point being made?
One reason I enjoy Go is because of the pragmatic stdlib. In most cases, I can get away without pulling in any 3p deps.
Now of course Go doesn’t work where you can’t tolerate GC pauses and need some sort of FFI. But because of the stdlib and faster compilation, Go somehow feels lighter than Rust.