How to rewrite it in Rust

(adventures.michaelfbryan.com)

238 points | FBT | 6y ago | 39 comments

sorenbs 6y ago

We did a similar thing with a Scala -> Rust rewrite for the http://prisma.io query engine.

By rewriting small components and integrating them into the existing project using the Java Native Interface, our small team of 5 developers was able to pull off this massive rewrite in just under a year. The resulting code base is rearchitected in a few very important ways, but mostly follows the same structure.

And because we kept and evolved our old Scala-based test suite, we have very high confidence in the rewrite.

When async/.await finally landed, we could switch over very quickly, and it has been a joy to focus on benchmarks and performance over the last month. Spoiler: Rust is faster than Scala :-D

tombert 6y ago

I promise that this is asked genuinely and isn't some sort of veiled "gotcha!" (it's tough to tell on the internet sometimes); what was the reason for a change from Scala to Rust?

I ask because Scala already has a good type system and the JVM typically has good performance nowadays, particularly with something like GraalVM, so I am actually really curious as to why you felt a Rust rewrite was a good idea.

gameswithgo 6y ago

Just some reasons I might make a switch from Java/C# to Rust:

* you can keep memory use quite a bit lower
* you can still sometimes get large constant factors of performance improvement over the JVM in some kinds of problem domains. If this means you can run on 1 server instead of, say, 5, you have a much simpler infrastructure.
* startup time, especially if you are doing 'serverless' or similar
* tail latency - even a good GC language will have occasional long pauses.
* data race protection at compile time
* easier deployment - no need to install a JVM and keep it updated

sorenbs 6y ago

I should add that the Rust community has been extraordinarily welcoming, and our existing Scala engineers were able to relatively quickly become proficient in Rust.

Huge shoutout to everyone working on Rust Async, especially the Berlin crew, who have been very helpful.

yann_s 6y ago

It'd be great to hear about your experience using the Java Native Interface. AFAIK, native calls can have a performance impact: https://en.wikipedia.org/wiki/Java_Native_Interface#Performa...

I guess the plan for what to make native is pretty important. Maybe you could share more details about that?

michael_j_ward 6y ago

Do you have any write-ups on this? I might be looking to do this for a large Java codebase soon.

Scarbutt 6y ago

How many lines of code? (old/new)

pimeys 6y ago

The Rust code is here: https://github.com/prisma/prisma-engine/

And the old Scala codebase here: https://github.com/prisma/prisma/

The old codebase has parts in Rust, so counting the lines is not that straightforward.

pornel 6y ago

This ability to incrementally add Rust to a C codebase is very useful for adopting Rust in established projects.

You don't actually have to rewrite everything on day one. You can stop writing new C code right now, and then gradually replace old code only when you need to fix it or refactor it.

Sammi 6y ago

Twice I've been part of a move from JavaScript to TypeScript that worked much the same way. Both projects were large applications with several developers working on them. Both had been ongoing for a few years before the port started. In both projects we decided to write all new code in TypeScript and convert JS to TS whenever we made any largish change to an existing JS file. In both cases it took around a year for us to get more than 90% of the code converted this way, and at that point we decided to create issues in our issue tracker for porting the rest, which was then converted in a couple of months.

The big difference, however, is that JS and TS can live side by side on a file-by-file basis out of the box with the TypeScript compiler, which makes it super easy to convert. You don't have this luxury with C and Rust out of the box, but serious kudos to the author for finding a way to do something very similar.

When converting C to Rust you usually have to do things on a module by module or compile artifact by compile artifact basis, which makes it much more challenging. You can however employ some sort of strangler pattern: https://docs.microsoft.com/en-us/azure/architecture/patterns...

zozbot234 6y ago

> When converting C to Rust you have to do things on a module by module or compile artifact by compile artifact basis, which makes it much more challenging.

OP is essentially about proving the opposite. It does take a bit of setup to get there, but you can ultimately translate C to Rust on a function-by-function basis, and Rustify interfaces, data structures, etc. only gradually after nothing on the C side is relying on the older defs.

C++ would be more of a challenge - you need to forgo quite a few C++-exclusive features to end up with interfaces that Rust can work with. That's where an "artifact by artifact" approach might work better. Other languages would be roughly similar, with their heavyweight C FFI's.
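
As a rough illustration of the function-by-function idea above (a minimal sketch, not code from the article): `Counter` and `counter_advance` are hypothetical names, and the C definitions shown in the comments are assumed. The struct stays `#[repr(C)]` so the C side can keep relying on its own definition while just this one function is implemented in Rust.

```rust
// Hypothetical example: `Counter` and `counter_advance` are illustrative
// names, not taken from the article or from tinyvm.

/// Mirrors an assumed C definition:
///     struct counter { int32_t value; int32_t step; };
/// `#[repr(C)]` keeps the field layout identical to the C struct.
#[repr(C)]
pub struct Counter {
    pub value: i32,
    pub step: i32,
}

/// Drop-in replacement for an assumed C function:
///     int32_t counter_advance(struct counter *c);
/// Returns the new value, or -1 if `c` is NULL.
///
/// # Safety
/// `c` must be NULL or point to a valid `Counter` that nothing else is
/// accessing during the call.
// Needs Rust 1.82 or newer; older toolchains spell this `#[no_mangle]`.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn counter_advance(c: *mut Counter) -> i32 {
    // `as_mut` turns the raw pointer into `Option<&mut Counter>`,
    // yielding `None` for NULL.
    match unsafe { c.as_mut() } {
        Some(counter) => {
            counter.value = counter.value.wrapping_add(counter.step);
            counter.value
        }
        None => -1,
    }
}
```

Existing C callers would keep their header declaration unchanged and link against the Rust-built object instead of the old C one. Only once nothing on the C side touches the struct directly would it make sense to give it a more Rust-like shape.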

tombert 6y ago

I did something similar moving a large-ish C# project to F#. The company I worked for was mostly an F# place, but we had a fairly large legacy codebase written in C#; typically we had the pattern of "if you have downtime or need to fix a bug in the C#, just rewrite the C# code into F#".

Annoyingly, at least with the typical msbuild pattern, you have to be using the same language at the project level, but you can have as many projects as you want per solution (it's weird). So it's not as seamless as the JS/TS system, but overall it's not too bad, since you still can mix and match somewhat.

dfox 6y ago

Is there any way to do this the other way around and integrate new Rust code into a non-trivial C project with a non-trivial build system?

steveklabnik 6y ago

Yep, it sort of looks the same. This is how Rust entered Firefox, for example. The most straightforward way is to call `cargo build` from within whatever build system you're using for C.

saagarjha 6y ago

> However, Rust has a killer feature when it comes to this sort of thing. It can call into C code with no overhead (i.e. the runtime doesn’t need to inject automatic marshalling like C#’s P/Invoke) and it can expose functions which can be consumed by C just like any other C function.

As we see below, you may still need to write some code to convert from C types to something that is more ergonomic to use in Rust. But the marshaling ABI-wise is minimal.
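
To make the "convert from C types" part concrete, here is a minimal sketch of the kind of glue code that tends to be needed. `string_from_c` is a hypothetical helper, not something from the article or from tinyvm. The call across the boundary itself needs no marshalling; the work is in checking the pointer and the encoding.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Hypothetical helper (not from the article): copies a NUL-terminated C
/// string into an owned Rust `String`.
/// Returns `None` for a NULL pointer or for bytes that are not valid UTF-8.
///
/// # Safety
/// `ptr` must be NULL or point to a NUL-terminated string that stays valid
/// for the duration of the call.
pub unsafe fn string_from_c(ptr: *const c_char) -> Option<String> {
    if ptr.is_null() {
        return None;
    }
    // `CStr` borrows the C buffer in place (after scanning for the
    // terminating NUL); nothing is copied at this point.
    let borrowed = unsafe { CStr::from_ptr(ptr) };
    // The UTF-8 check and the copy into a `String` are the actual conversion.
    borrowed.to_str().ok().map(String::from)
}
```

Returning an `Option<String>` is just one design choice; a wrapper that only needs to read the text briefly could hand back a borrowed `&CStr` instead and skip the copy.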

> Turns out the original tinyvm will crash for some reason when you have multiple layers of includes

It's actually crashing because there are no lines of code in the file, so certain data structures never get initialized.

rvz 6y ago

> As we see below, you may still need to write some code to convert from C types to something that is more ergonomic to use in Rust. But the marshaling ABI-wise is minimal.

Exactly, as much as I love writing Rust code, there is nothing more frustrating than maintaining bindings from C to Rust. Then you have to create another idiomatic binding on top of it. Sure, bindgen is great, but Zig's cImport and Swift's ClangImporter take this further.

They both use clang modules to tackle the first part. An autogenerated idiomatic solution would be great for Rust, but sadly it doesn't exist.

paulddraper 6y ago

At some point, I figure you will always need human intervention to get actually idiomatic interop, whether it's C marshalling, JSON serialization, database persistence, etc.

ridiculous_fish 6y ago

> This is where build scripts come in. Our strategy will be for the Rust crate to use a build.rs build script and the cc crate to invoke the equivalent commands to our make invocation

Yikes - port the entire build system to cargo before you write a line of Rust. Now draw the rest of the owl!

Surely there's an incremental path for the build as well? Perhaps if you're using CMake?

pornel 6y ago

You can build Rust code as a static library and make the C build system consume that instead. This is the approach that librsvg took.

Moving the C build to build.rs is not necessary. It's usually done only because people used to Cargo don't like bringing CMake along. If you were to publish this as a Rust crate, it'd be slightly easier for downstream Rust users to have one less external tool to install.

Rusky 6y ago

There's a `cmake` crate that you could use from `build.rs` the same way, or you could go the other direction and have your existing build system invoke Cargo or rustc.

staticassertion 6y ago

Great guide, this looked surprisingly straightforward.

wheaties 6y ago

I came here to say the same thing. This is a great article on "Rust is safe for our org."

bfrog 6y ago

This seems to be exactly what remacs is doing with Emacs.
