The article is really about JNI in Rust.
I see most questions are about "Why did you not use X language instead?", so let me try and address this.
To answer the "Why not just Rust", I should first mention that Rust was still in its early days (before 1.0), and it was a risky bet to choose an emerging language.
The project was started by Vlad (our CEO) who had a background writing high performance Java in the trading space. The Zero-GC techniques - whilst uncommon in open source software - are mature and a staple of writing high performance code in the financial industry. The product evolved organically, feature after feature.
I personally joined the team from a C and C++ background, having previously moved from a project that suffered from minute-long compile times from single .cpp files due to template overuse. Whilst I do miss how expressive high-level C++ can be, Java has really good tooling support.
When writing systems-style software, most of what matters in terms of performance is how we make system calls, manage memory, and debug and profile. This is an area where Java really shines. Don't get me wrong: in the absolute sense I think C++ tools tend to be better (Linux Perf is awesome!), but Java tooling is _there_. IntelliJ makes it trivially easy to run a test under the debugger reliably and consistently. It's equally easy to run a profiler and to get code coverage. The same tools work across all platforms too, might I add. It's not necessarily better, but it's easier. It turns out that, while a little quaint, Java was a pretty good choice in practice, in my opinion.
Times have moved on. The Rust community really cares about tooling, and it's one of the reasons why we've picked it over expanding our existing C++ codebase: We just want to get stuff done and have enough time left in our dev cycle to properly debug and profile our code.
Our open source database edition can also be used embedded though, so we can only upgrade at the pace of our customers and because of that we still are compatible all the way down to Java 8.
Were it not for this detail, we'd probably consider it a lot sooner.
The short of it is: Learn C. Learn your system calls. Learn JNI. Learn about sun.misc.Unsafe. Learn about the disruptor pattern. Learn how to pool objects. The long type can pack a lot of data. Go from there!
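To make the "long can pack a lot of data" point concrete, here is a minimal sketch of packing two 32-bit ints into one 64-bit long, so a pair can travel through code without allocating a wrapper object. The class and method names are hypothetical, made up for illustration; this is not taken from the QuestDB codebase.

```java
// Hypothetical helper (not from QuestDB): two ints stored in a single long.
final class PackedLong {
    private PackedLong() {}

    // High 32 bits hold `hi`, low 32 bits hold `lo`.
    // The mask stops a negative `lo` from sign-extending into the high half.
    static long pack(int hi, int lo) {
        return ((long) hi << 32) | (lo & 0xFFFF_FFFFL);
    }

    // Unsigned shift brings the high half down; the cast restores the sign.
    static int hi(long packed) {
        return (int) (packed >>> 32);
    }

    // The narrowing cast keeps only the low 32 bits.
    static int lo(long packed) {
        return (int) packed;
    }

    public static void main(String[] args) {
        long p = pack(-7, 42);
        System.out.println(hi(p) + " " + lo(p));
    }
}
```

The same trick extends to other layouts (four shorts, a 48-bit offset plus a 16-bit tag, and so on), as long as the fields fit in 64 bits.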
A little example. Yesterday I was making changes to our C++ client library and I wanted to improve the example in our documentation.
We use a dedicated protocol called ILP for streaming data ingestion and each of the inserted rows has a designated timestamp.
In the Rust example, I added support for chrono::DateTime, and it was trivially easy to add a timestamp for a specific example date and time: Utc.with_ymd_and_hms(1997, 7, 4, 4, 56, 55).
Our C++ library instead takes an std::chrono::time_point. I wanted to use the same datetime. As far as I can tell it requires first going through the old C "struct tm" type (which is local and not UTC), then converting to "time_t", then converting to UTC via gmtime, and then constructing a time_point from that. After 10 minutes the code got too long and complicated, so I just substituted a timestamp specified as an int64_t in nanoseconds.
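For comparison on the Java side, here is a rough sketch of getting the same example datetime as a 64-bit nanosecond timestamp using only java.time. The class and method names are hypothetical, and this does not use the QuestDB client API; it only shows the date-to-nanos conversion.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

// Hypothetical helper (not part of any QuestDB client): UTC datetime -> epoch nanos.
final class DesignatedTimestamp {
    private DesignatedTimestamp() {}

    static long utcNanos(int year, int month, int day, int hour, int minute, int second) {
        // Interpret the fields as UTC, independent of the machine's local time zone.
        Instant instant = LocalDateTime.of(year, month, day, hour, minute, second)
                .toInstant(ZoneOffset.UTC);
        // The exact-arithmetic methods throw instead of silently overflowing a long.
        return Math.addExact(
                Math.multiplyExact(instant.getEpochSecond(), 1_000_000_000L),
                instant.getNano());
    }

    public static void main(String[] args) {
        System.out.println(utcNanos(1997, 7, 4, 4, 56, 55));
    }
}
```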
Don't get me wrong, the C++ time_point is a work of art in how flexible it is, but unnecessarily complicated in most cases.
I should add that I also spent 45 minutes yesterday debugging a CMake issue.
Rust is not easy to learn, but it's just more modern and productive.
C++ is still great if you've got a massive team, but at our scale I don't think it makes any sense.
I've seen this "GC-less" Java in those use-cases quite a bit. From a conceptual design POV it's likely not the best approach, but there's a lot of sunk cost in that eco-system and a lot of trust and expertise where "Choosing a better language" is often several orders of magnitude more expensive.
Others have done the same. And as others have pointed out, there's things outside the DB domain in high frequency trading and the like that have done this as well.
There are advantages to Java: mature runtime, large talent pool out there, good tooling (still haven't seen anything as good as JMX for any other runtime). And if there's any language whose GC could be tuned to be "responsible", it'd be the JVM; there's been more GC R&D in the JVM than in any other runtime.
I worked at RelationalAI (another DB vendor) for a bit, and their DB is all written in Julia, another garbage collected language... and the GC in Julia is what I'd characterize as ... immature... for that kind of application. I would have loved to have access to the JVM's GC there.
Also this looks to be more of an analytical, column oriented, database. So I can imagine they're optimizing more for throughput than transactional latency. (I could be wrong, correct me, Quest folks...)
And choice of Java likely has to do with when they began working on the project and what was out there at the time. It's the real world of software eng. We work with the tools and people we have because shipping a product on time and bringing in $$ is more important than anything else. I don't know when they got started, but Rust has only matured to "mainstream" stability/acceptance in the last 2-3 years.
Finally, DBs often have a very layered architecture and they could easily compartmentalize pieces such that latency sensitive bits could be done in native Rust. They're not apparently doing this, but I could see them doing things like moving the page buffer or column indices or storage engine over to Rust over time for performance benefits.
All power to them, it's great to see them working with Rust. (aside: my email history looks like I spoke to a recruiter there at some point, maybe, but didn't interview? I think if I'd known they were playing with Rust I would have given that more attention...)
Yes that is the case
Also, Rust is a hard language to start a company with so I wouldn't be surprised if this is more of a product maturity thing.
Presumably you can't use Hotspot so you have to write your own VM too?
Might become less common now that Rust is reaching the level it is.
It's fairly hard to write ergonomic interfaces for more complicated iteration patterns in Rust while still respecting safety. That's actually fine and by design, and it's possible with a lot of effort and thought but this is not as much of a concern in e.g. Java. E.g. skim the discussion on this proposed "cursor" API for Rust's stdlib BTree: https://github.com/rust-lang/rust/issues/107540
(And while Rust's enum-based algebraic types & pattern matching are nice, they're actually fairly limited when compared to what you can find in e.g. Scala or F#, Haskell, etc.)
But I think there's also huge win in doing something like the pager/buffer pool/storage/data structure/indexes layer in Rust. For safety and efficiency reasons.
The only thing they say that explains it is they end up with a single jar file, whose only dependency is the JRE.
So I guess they get platform independence and easy installation.
Just note that depending on the selection of the GC, this kind of usage may make the GC work more than when allocating new objects, not less. In particular, with the newer GCs -- G1 and ZGC -- mutating existing objects may be more costly than allocating new ones depending on circumstances. In general, the new GCs are optimised to work the least and give the best performance when the allocation rate is neither too high nor too low; the new GCs also reuse memory better than object pools. Reusing objects also precludes scalarization optimisations, i.e. not every `new Foo` actually results in a heap allocation, and can be optimised to work directly in registers.
So while on very old JVMs (such as Java 8) a "zero allocation" strategy may result in better performance and in "zero GC", on newer JVMs it may result in worse performance and more GC work (in fact, it will almost surely not yield zero GC). While it depends on many variables, I would advise against a zero allocation strategy on newer JVMs as the default path toward better performance or even better latency. It's an approach that seems to be very strongly coupled to the way the JVM was designed over a decade ago, but a lot has changed since then.
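A small sketch of the two styles being compared, for anyone who wants to see them side by side. The names are made up for illustration. Whether the JIT actually scalar-replaces the object in the first method depends on the JVM version, flags and inlining decisions, so treat the comments as "may", not "will".

```java
// Hypothetical illustration of "allocate fresh" vs "zero allocation" styles.
final class AllocationStyles {
    private AllocationStyles() {}

    // Small value-like class used by both styles.
    static final class Vec2 {
        double x, y;
        Vec2(double x, double y) { this.x = x; this.y = y; }
    }

    // Style A: a fresh, short-lived object per iteration. It never escapes this
    // method, so escape analysis may let the JIT keep x and y in registers and
    // skip the heap allocation entirely (scalar replacement).
    static double sumFresh(int n) {
        double total = 0;
        for (int i = 0; i < n; i++) {
            Vec2 v = new Vec2(i, 2.0 * i);
            total += v.x + v.y;
        }
        return total;
    }

    // Style B: one long-lived object mutated in place. No allocation per
    // iteration, but the object lives on the heap and is visible to other code,
    // so it cannot be scalar-replaced. It is also not thread-safe as written.
    private static final Vec2 SCRATCH = new Vec2(0, 0);

    static double sumPooled(int n) {
        double total = 0;
        for (int i = 0; i < n; i++) {
            SCRATCH.x = i;
            SCRATCH.y = 2.0 * i;
            total += SCRATCH.x + SCRATCH.y;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumFresh(1000) == sumPooled(1000));
    }
}
```

Both methods compute the same result; which one is faster on a given JVM and GC is exactly the thing that needs measuring rather than assuming.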
Additionally, Java now offers manual memory management and efficient FFI that are significantly better than what JNI offered: https://openjdk.org/jeps/442
> https://openjdk.org/jeps/442
This succinctly is the answer.
> not every `new Foo` actually results in a heap allocation, and can be optimised to work directly in registers.
While I know in my heart of hearts that you are right, this category of advice has so many caveats. It's like when Graal markets itself as a potential substitute for NodeJs, where its performance is abjectly terrible.
I have never personally succeeded in convincing anyone with these deep programming traditions to try something new that may be Better in Every Way. It always has to be this long, trickle down journey of adoption. People still talk about John Carmack and video game engineering, decades after any of his innovations have been verbatim relevant and decades after an individual could possibly have a hope and prayer of authoring a first person shooter with an audience from scratch. But see, I need to know a lot of stuff to understand that.
This is true of almost anything really, not just programming.
This is 2ish years old but has a lot of detail on what escape analysis can and cannot see:
https://gist.github.com/JohnTortugo/c2607821202634a6509ec3c3...
Here's another (related) caveat with taking the "minimal object creation", pooled memory, and/or using lots of non-GC native heap memory path: now you've got blocks of memory sitting there in Process RSS that the GC either can't do anything about, or (worse) knows nothing about.
These types of runtimes are on the whole designed with the philosophy that they Own All The Things, so it's like an invasive body driving the "immune system" nuts...
Never had this problem in Java (haven't worked in it for 10 years) but at a previous job (in Julia) I worked on a database buffer pool / pager where the memory was explicitly managed/allocated (through syscalls to anonymous mmap) for performance reasons. Those pages belong to the same PID as the broader Julia process, but were not managed by Julia runtime, and so the runtime can run into all sorts of issues with OOM kills or huge pauses as the system allocates aggressively without collecting often, thinking it has plenty of headroom... but doesn't.
I know for a fact the JVM GC is smarter than this, and can be tuned more expertly to manage these type of situations, but it's still a big giant caveat...
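As a rough Java-side illustration of memory that sits in process RSS but outside the GC-managed heap, here is a sketch using a direct ByteBuffer. The class name is made up. A direct buffer is the milder case: the JVM does track it and frees it once the buffer object is collected. Memory obtained through Unsafe, JNI or a raw mmap is the "knows nothing about" case, which this sketch does not cover.

```java
import java.nio.ByteBuffer;

// Hypothetical demo: native memory that the heap counters do not include.
final class OffHeapDemo {
    private OffHeapDemo() {}

    // Allocates `bytes` of native memory; only a small wrapper object lives on the heap.
    static ByteBuffer allocateOffHeap(int bytes) {
        return ByteBuffer.allocateDirect(bytes);
    }

    // Approximate heap usage as reported by the runtime.
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedHeap();
        ByteBuffer buf = allocateOffHeap(16 * 1024 * 1024);
        buf.putLong(0, 42L); // touch the memory so it is actually backed by a page
        long after = usedHeap();
        // The heap delta is usually far smaller than the 16 MiB just allocated,
        // because the payload is not on the Java heap.
        System.out.println("direct=" + buf.isDirect()
                + " capacity=" + buf.capacity()
                + " heapDeltaBytes=" + (after - before));
    }
}
```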
It's a little bit surprising to me that low object creation can degrade GC performance; what's the failure mode for G1/ZGC in this scenario?
Doing more young generation collection is sometimes cheaper in aggregate (more frequent but far smaller and usually much more efficient) than adding more data to the older generations (less frequent and more costly, longer pauses, for object pools it happens on all of it even when none of it is currently used, etc).
But as doctorpangloss said: so many caveats. There's ample evidence that it is both better and worse, it depends on lots of details.
The main thing you can confidently claim is that it is not the majority of code, so most language optimizations will choose to improve straightforward and common stuff at the cost of this niche. Not always, but there is definitely more energy in improving the 90%+ cases and that adds up over time. Squeezing out the last bits of performance requires constant upkeep.
The failure mode is GC barriers: special code that gets triggered on some operations by some GCs, such as when mutating a reference field during a GC cycle (that's a "write barrier").
Concurrent GCs have special rules for newly allocated objects (which are really just a pointer bump) because no one else has seen them yet; young objects require no GC barriers. But once an object is old, a concurrent GC needs to do some work to learn about references in the object changing. So while allocating a new object is usually a pointer bump (and sometimes not even that when the object is scalarized), mutating an old object triggers a GC slow-path that has to mark some data in a shared data structure (so we're talking memory ordering fences) to make sure that the mutated pointer is not overlooked by the GC.
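To point at the exact statements being discussed, here is a sketch with hypothetical names. The comments restate the description above; what the barrier actually costs depends on the collector, the JVM version and whether a GC cycle is in progress, so none of this is a measurement.

```java
// Hypothetical illustration of where a reference store into a pooled object happens.
final class BarrierSketch {
    private BarrierSketch() {}

    static final class Slot {
        long count;    // primitive field: stores here change no references
        int[] payload; // reference field: stores here are what the GC must learn about
    }

    // A long-lived, pooled object. After surviving enough collections it will
    // normally have been promoted to the old generation.
    static final Slot POOLED = new Slot();

    static Slot reusePooled(int[] data) {
        POOLED.count++;        // primitive store
        POOLED.payload = data; // reference store into a long-lived object: this is
                               // the kind of write where a concurrent GC's barrier
                               // may have to take its slow path
        return POOLED;
    }

    static Slot allocateFresh(int[] data) {
        Slot s = new Slot();   // usually a pointer bump in a thread-local buffer
        s.count = 1;
        s.payload = data;      // store into an object no other thread has seen yet
        return s;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3};
        System.out.println(reusePooled(data) == reusePooled(data));     // same instance
        System.out.println(allocateFresh(data) == allocateFresh(data)); // distinct instances
    }
}
```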
OpenJDK's new GCs are really, very, very good. The new generational ZGC in JDK 21 (with sub-millisecond worst case pause) is just amazing. But these GCs are optimised for "reasonable" Java code and against "unreasonable" object pooling. Things are different from where they were a decade ago in Java 8.
It's true that our non-idiomatic Java usage denies us some of the benefits typically associated with Java programming. Automatic memory management and the old "Write Once, Run Anywhere" paradigm are difficult to maintain due to our reliance on native libraries and manual memory management.
I see two classes of reasons for choosing Java:
1. Historical: The QuestDB codebase predates Rust. According to Wikipedia, the initial Rust release was in 2015. The oldest commit in the QuestDB repo is from 2014: https://github.com/questdb/questdb/commit/95b8095427c4e2c781... What were the options back in 2014? C++? Too complicated. C? Too low-level. Pretty much anything else? Either too slow or too exotic.
2. Technical: Java, even without GC or WORA, still offers some advantage. 2a: The tooling is robust, especially when compared to C++. This starts with build systems (don't get me started on CMake!), and extends to aspects like observability. Stacktraces in Java are taken for granted. What's the state of stacktrace walking/printing in C++? I think it boils down to either Boost, C++23, or some other form of black magic. (I might be wrong here tho) 2b: It's a simpler language, especially when compared to C++ or even Rust. This makes it easier to hire people and also attracts external contributors: https://github.com/questdb/questdb/graphs/contributors 2c: The HotSpot JIT still provides us with solid peak performance without having to mess with PGO, etc. 2d: Concurrency is easier with Java's managed memory, eliminating the need for hazard pointers and the like.
Note that Discord themselves have used Rust for bottlenecks with Erlang/Elixir/BEAM[1].
[0]: https://questdb.io/customers/
[1]: https://discord.com/blog/using-rust-to-scale-elixir-for-11-m...
In some sense it reads a lot like they just needed an excuse to use Rust. Not that Rust is a bad choice for the areas they list as possible candidates for non-Java code, it's just a really odd choice for a Java shop, but then again, so is fighting the GC.
It is incredibly fascinating work though.
InfluxDB 1.x had a chance to be great: they wrote a time series database that used a SQL-like dialect, they had a really nice alerting platform, they had a really nice visualization tool. Then they abandoned all of that to jump on the hype train of "Write a new programming language!" which was the hot thing like 5 or 6 hype cycles ago. We've _never_ upgraded to their 2.x product because it literally threw away our investment.
I think if you guys pick up where they left off you'll be tremendously successful.
https://docs.oracle.com/en/graalvm/jdk/21/docs/reference-man...
Still GraalVM could be an interesting solution as Rust seems to be supported: https://www.graalvm.org/latest/reference-manual/llvm/Compili...
https://github.com/oracle/graal/blob/master/substratevm/src/...
And in any case:
JEP draft: Prepare to Restrict The Use of JNI https://openjdk.org/jeps/8307341
JEP 442: Foreign Function & Memory API (Third Preview) https://openjdk.org/jeps/442
I do have one question around the assignment to `static mut CALL_STATE`. Don't you need some form of synchronization/memory fence/memory barrier to make sure that other threads see that assignment?
On x86/x64 it probably doesn't matter (total store order), but other architectures are less lenient.
https://github.com/questdb/rust-maven-plugin
Compared with JNA, JNI is indeed more complex, but it's faster and has more features. It also solves the problem of calling Java from Rust.
Have you guys benchmarked FFI in Java 21 (preview, now release) yet? :) Yes I know it's super new, but I'm curious if there is a benefit in terms of:
1. performance
2. ease of maintenance
3. ease of finding production problems