I rewrote a Clojure tool in Rust

(timofreiberg.github.io)

169 points · praveenperera · 5y ago · 76 comments

chrisulloa · 5y ago · 10 in thread

While I definitely agree Rust is a much faster language than Clojure, I would be interested to see benchmarks on your code that show just how much faster your Rust code was on the same data.

I also noticed that you mentioned avoiding lazy sequences is not idiomatic in Clojure. I disagree with this since using transducers is still idiomatic. I wonder if you could've noticed some speed improvements moving your filters/maps to transducers. Though I doubt this would get you to Rust speeds anyway, it might just be fast enough.

j-pb · 5y ago

I just moved a medium-sized codebase from Clojure transducers to JS, and after having used Clojure for 7+ years, and done so professionally, I don't wanna go back, ever. The JS solution is shorter, faster, and easier to understand. I'm thankful for the insights into reality and programming Clojure has provided, but highly optimised Clojure is neither idiomatic nor pretty; you end up with eductions everywhere. Combine that with reaaaallllyy bad debuggability with all those nested inside-out transducer calls (the stack traces have also gotten worse over the years, I don't know why) and a splintered ecosystem (lein, boot, clj-tools), and I'd pick Rust and Deno/JS any day for a greenfield project over Clojure. Sadly.

divs1210 · 5y ago

Yup, it's like the leadership is actively hostile towards community building.

* Prismatic Schema, immensely popular, was "replaced" by spec, which is not yet complete and still in the research phase

* Leiningen (one of the best language tools out there) was "replaced" by Clojure CLI that can't do half of what Leiningen can

* transducers (a brilliant concept) are not easy (as in close at hand) because the code is quite different to normal lazy-sequence based code (I wrote a library [1] to address this)

I still prefer Clojure for all my side projects, but it is very clear that the community is tiny and fragmented.

[1] https://github.com/divs1210/streamer

1 more reply

kostadin · 5y ago

I have limited exposure to Clojure transducers but I spend most of my time writing JS/TS and I've found thi.ng/transducers[0] a pleasure to work with and super elegant for constructing data processing workflows.

[0] https://github.com/thi-ng/umbrella/tree/develop/packages/tra...

nbardy · 5y ago

Debugging performance problems is the reason I stopped using `cljs`. Those stack traces are so painful.

ithrow · 5y ago

That's a very common scenario for Clojure users to go through, and it's one of the reasons it has so many abandoned/unfinished libraries (although 7 years was a lot). After they have gathered all the insights they can from Clojure and its ecosystem (which is a worthy endeavor IMO), they go back to their big-ecosystem mainstream programming language because of all the benefits you get from it, even if that programming language is worse. It also doesn't help Clojure that JS in 2020 is way better than JS in 2010 and that you can easily bring all your Clojure insights/concepts to JS.

3 more replies

beders · 5y ago

> reaaaallllyy bad debuggability

Odd, I can just step through my code with Cursive if I need to.

1 more reply

bjoli · 5y ago

I doubt it will bring much. If properly implemented, there is nothing that makes generator-like laziness slower than transducers, and since it is pretty central to clojure I doubt you will see much speed gain by using transducers.

In scheme, the srfi-158 based generators are slower than my own transducer SRFI (srfi-171) only in schemes where set! incurs a boxing penalty and where the eagerness of transducers means mutable state can be avoided.

Now, I know very little clojure, but I doubt they would leave such a relatively trivial optimization on the table. A step in a transducer is just a procedure call, which is the same for trivial generator-based laziness.

lilactown · 5y ago

When I was reading the article, I thought the author of the post was probably pointing more in the direction of Clojure's immutable data being slower, rather than laziness specifically.

IME (admittedly in a different context, doing UI development) Clojure's seqs and other immutable data can be a huge performance drag due to additional allocations needed. If you're in a hot loop where you're creating and realizing a sequence immediately, it's probably much faster to bang on a transient vector. Same with creating a bunch of immutable hash maps that you then throw away; better to create a simpler data structure (e.g. a POJO or Map) which doesn't handle complicated structural sharing patterns if it's just going to be thrown away.

Transducers would help in the author's first case to take the map/filter piped through the `->>`, which is going to do two separate passes and realize two seqs, and combine it into one.
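For readers on the Rust side of the comparison, the two-pass versus one-pass difference described above can be sketched as follows. This is only an illustration, not the article's code: the function names and the doubling/divisible-by-3 steps are made up, and Rust's lazy iterator adapters stand in here as a loose analogue of composing transducers.

```rust
// Hypothetical example: double every value, then keep results divisible by 3.

/// Two passes: each step collects into its own Vec, loosely analogous to
/// realizing one seq after `map` and another after `filter`.
fn two_pass(input: &[i64]) -> Vec<i64> {
    let doubled: Vec<i64> = input.iter().map(|x| x * 2).collect();
    let kept: Vec<i64> = doubled.into_iter().filter(|x| x % 3 == 0).collect();
    kept
}

/// One pass: the adapters are composed first and applied element by element,
/// so only the final Vec is allocated.
fn one_pass(input: &[i64]) -> Vec<i64> {
    input.iter().map(|x| x * 2).filter(|x| x % 3 == 0).collect()
}

fn main() {
    let data: Vec<i64> = (1..=10).collect();
    // Both versions produce the same result; they differ in intermediate work.
    assert_eq!(two_pass(&data), one_pass(&data));
    println!("{:?}", one_pass(&data));
}
```

Whether the single-pass form is measurably faster depends on the data and would need benchmarking, as several comments in this thread point out.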

1 more reply

adamnemecek · 5y ago

I mean Clojure is still running on the JVM so there will be at least that difference. AFAIK Clojure is slower than Java so there's that also.

1 more reply

dgb23 · 5y ago

Transducers are definitely idiomatic. They are more general over "similar things to transform in steps" (including sequences, messages and so on), so you can apply them to collections ("I have the whole data in advance") or channels ("I get the data piece by piece") and so on.

Another idiomatic way to improve performance is to use transients[0]. From the outside your function is still a function, but on the inside it's cheating by updating in place instead of using persistent data structures. See the frequencies function for a simple example[1].

Clojure and Rust are both very expressive languages and even though they both can be considered niche, they have _massive_ reach: Clojure taps into the JVM and JS ecosystems, Rust can also compile to WASM or be integrated with the JVM via JNI.

The big difference between the two, and why I think they complement each other nicely, is that Clojure is optimized for development, and does its best at runtime, but Rust is optimized for runtime, and tries its best at development. (A similar take in the article). In other words: they both achieve their secondary goal well, but resolve trade-offs by adhering to their primary in the vast majority of cases.

[0] https://clojure.org/reference/transients

[1] https://github.com/clojure/clojure/blob/clojure-1.10.1/src/c...
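The "still a function from the outside, updating in place on the inside" idea mentioned above has a direct counterpart in Rust. The sketch below is not a port of Clojure's implementation; it is a hypothetical `frequencies` written for illustration, where a local `HashMap` is mutated while it is built and then handed back as an ordinary value.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Counts how often each item occurs. The map is mutated in place while it is
/// being built, but that mutation never escapes: callers only see the result.
fn frequencies<T: Eq + Hash + Clone>(items: &[T]) -> HashMap<T, usize> {
    let mut counts: HashMap<T, usize> = HashMap::new();
    for item in items {
        // Insert 0 for an unseen key, then bump the counter in place.
        *counts.entry(item.clone()).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let words = ["a", "b", "a", "c", "a"];
    let counts = frequencies(&words);
    assert_eq!(counts.get("a"), Some(&3));
    assert_eq!(counts.get("b"), Some(&1));
    println!("{} distinct items", counts.len());
}
```

The same input always yields the same map, so the function behaves as pure to its callers even though the body is imperative.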

didibus · 5y ago · 5 in thread

A bit sad there was no profiling done, or at least the article doesn't mention it. Maybe optimizing Clojure wouldn't have been that hard; it could have been that only a few places needed tweaking. In any case, Rust is obviously targeting high performance in a way Clojure isn't. Rust is faster than Java, and Clojure can only ever match Java in performance, not exceed it. But still, it's not clear if the author tried to optimize the Clojure version or not?

fulafel · 5y ago

> Rust is faster than Java, and Clojure can only ever match Java in performance, not exceed it.

The Rust vs Java question translates to the age-old C++ vs Java argument, where the counterpoint is that Java can be faster because the JVM has no significant disadvantage in code generation but JIT and GC can be faster than AOT and malloc, and then there are many back and forth arguments and nobody changes their mind.

In another sense, ease of use and HLL properties of languages can in practice give performance advantages. Given the same amount of time, the programmer of a more expressive high-level language might have more time to iterate and to do algorithm work that ends up having much bigger effects than the relatively small differences in compiler code generation.

(The word performance of course also has meanings other than code execution speed...)

jcelerier · 5y ago

> and then there are many back and forth arguments and nobody changes their mind.

except that people routinely rewrite Java code in C++ in 2020 and run circles around the Java code, even when tuning GCs etc etc, à la https://www.scylladb.com/2020/10/06/c-scylla-in-battle-royal... or Minecraft Bedrock edition (C++) vs original Java Minecraft

How many times have there been rewrites from C++ to Java that ended up being faster?

2 more replies

dgb23 · 5y ago

> Rust is faster than ...

Not if you use the wrong constructs, copy stuff around in the heap, use ref counting everywhere in longer running processes.

I'm not nitpicking here: in Rust you can get really fast, but it's on you to make that happen.

For example, persistent data structures (used in Clojure) are really fast, and for some operations and cases even close to optimal.

Performance is hard, and I very much agree with your question here. What has been measured and what are the results?

didibus · 5y ago

I agree with you, but I think the assumption here is that we're comparing two code bases that are both trying to be performant. The Rust defaults will probably start off more performant, and the ceiling will be higher as well.

A lot of the performance comes from the different paradigms though, so it's not always an apples-to-apples comparison. But I also think that's an assumption being made when talking about a Clojure vs Rust implementation. In the latter, you're most likely implying using mutable collections, fixed-size structs, primitive types, and a tighter memory allocation surface. And not surprisingly, those are the same changes you'd make to your Clojure code base to speed it up (most likely).

1 more reply

burnthrow · 5y ago

> Clojure can only ever match Java in performance

You're technically correct, but the typical Java program making heavy use of threads has inefficiencies (and incorrectness) that would be avoided with Clojure's higher-level async APIs. Just as it's easier to write idiomatic, performant C than the "faster" ASM.

systems · 5y ago · 3 in thread

i like how in the end the system was replaced by a database solution

every language needs easy access to a query-ble database; many problems are a lot simpler when solved declaratively as a query-ble database

the relational model is functional, and is a very good solution to a wide range of problems

i think the SQLite engine should be integrated in the standard library of every language, and either use sql, or the language can provide a native sql alternative in the original language itself, or we can create a new standard language (because yes, sql can be improved upon)

I think Chris Date's D language can be a place to start investigating an SQL alternative, or as a language that can be more easily emulated in other languages

secondcoming · 5y ago

'queryable' is a word!

wwweston · 5y ago

What would you say the advantages of Date's D over SQL are?

systems · 5y ago

This is from memory, and I saw few examples, but from what I recall the syntax was less DSL-ish. SQL is a DSL; it is not a general-purpose language

Chris Date's D (not to be confused with Walter Bright's D) looks more like a regular programming language

And was functional

So I think a language can add a library or an extension that acts like D, or simulates D

Check this to get an idea https://reldb.org/c/wp-content/uploads/2017/12/Rel-and-Tutor...

Also, as I recall, the notion of RelVars made it sound functional

Each relational operator returned a RelVar, so you were passing RelVars between operators until you got the result you wanted, which was a RelVar

Anyway this is from memory, so I may be wrong on many things .. but again from memory D or Tutorial D sounded a lot less DSL-ish and a lot more functional than SQL, which was an improvement in my mind

1 more reply

ithrow · 5y ago · 2 in thread

The article makes a good case for how Rust can look as clean as other high-level programming languages when writing high-level business logic.

dgb23 · 5y ago

I agree that Rust has some really great concepts. The example code makes heavy use of traits for example, which are very ergonomic and provide a dynamic feel.

The context is a rewrite AKA runtime optimization. So the result is already understood. A great use-case for Rust is top-down implementation.

Also the code doesn't show any of the more painful cases. From the article:

> There are some inefficiencies visible here, and they're probably the most important spots for performance improvements. But they're still there as fixing them was too hard and/or time-consuming for me.

Resolving these "inefficiencies" is where Rust really shines, because it _can_ resolve them and does so in an internally consistent way on top of it. But at the same time, this is where you really _slow down_ in development and need to think about the more complex and intricate concepts such as lifetimes and borrow semantics.

brabel · 5y ago

As someone who has been writing a lot of Rust, I think that's only apparent... there's lots of things you can't do in Rust that you can in a GC-based language. The restrictions Rust imposes on you (mostly about the borrow checker, especially when you have mutable values) make it much harder to write code... if you make it easier by cloning happily, you might end up with worse performance than in the GC-based language. If you don't, I guarantee your code won't look pretty with all those lifetime annotations.
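The clone-versus-lifetimes trade-off described above can be made concrete with a small sketch. Everything here is hypothetical and chosen only for illustration: the `name=value` line format, the `OwnedRule` and `BorrowedRule` types, and both parse functions are assumptions, not code from the article.

```rust
/// Easy version: every field is an owned String, so no lifetimes appear,
/// at the cost of two allocations per parsed line.
struct OwnedRule {
    name: String,
    value: String,
}

/// Borrowing version: no allocation, but the type carries a lifetime that ties
/// it to the input line, and that lifetime spreads to every signature using it.
struct BorrowedRule<'a> {
    name: &'a str,
    value: &'a str,
}

fn parse_owned(line: &str) -> Option<OwnedRule> {
    let (name, value) = line.split_once('=')?;
    Some(OwnedRule {
        name: name.to_string(),
        value: value.to_string(),
    })
}

fn parse_borrowed<'a>(line: &'a str) -> Option<BorrowedRule<'a>> {
    let (name, value) = line.split_once('=')?;
    Some(BorrowedRule { name, value })
}

fn main() {
    let line = String::from("port=8080");
    let owned = parse_owned(&line).unwrap();
    let borrowed = parse_borrowed(&line).unwrap();
    assert_eq!(owned.name, borrowed.name);
    assert_eq!(owned.value, borrowed.value);
    println!("{} = {}", borrowed.name, borrowed.value);
}
```

The owned version keeps working after `line` is dropped; the borrowed one does not compile in that case, which is the kind of restriction the comment refers to.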

burnthrow · 5y ago · 2 in thread

> After switching to Rust, I had to implement more complicated logic that resolved dependencies between the rules I was diffing. This became complicated enough that even with a static type system I could only barely understand it. I doubt I would have been able to finish it in Clojure.

I have no idea what the author's trying to say here.

mattrepl · 5y ago

I believe they are saying the static type system and type checker provided by Rust helped them write complicated code.

burnthrow · 5y ago

Yes but it's not clear how. The post is light on details in general.

1 more reply

kristoft · 5y ago · 2 in thread

Nice post, very well written! Just not sure why compare a non-GC, low-level, statically typed language with a VM-based, garbage-collected dynamic language.

nindalf · 5y ago

It’s a fair comparison because many developers, including the author, would have to make a choice between languages, some of which are garbage-collected while others are not.

higerordermap · 5y ago

"for writing same tool" so..

SomeHacker44 · 5y ago · 1 in thread

I personally was quite astonished (not in a good way) by the use of #_"comments" instead of ;[;[;[;]]] comments.

tincholio · 5y ago

Also, the emojis in them...

tw25497050 · 5y ago · 1 in thread

Sometimes the architecture/algorithm matters, and sometimes the architecture/algorithm needs to align with the language. Absent seeing the broader code base [1], I'm inclined to think that the author's larger design led to these expensive functions existing as they did [2].

Pure speculation on my part, but if one has a lot of experience with imperative, mutable languages, one might design a system that ends up being not so great when written in a functional, immutable language. If so, then seeing improvements when directly porting to an imperative, mutable language might be not so surprising.

Tangent: Regarding the power and importance of code structure, I highly recommend watching "Solving Problems the Clojure Way" by Rafal Dittwald at Clojure North 2019 [3].

[1] I didn't see a link, but if it's available, I'd love to take a look.

[2] The `rule-field-diff` function for example seems to be burdened with some odd choices, e.g., taking in two "rules" as arguments (which seem to be collections of rules keyed by field), then using two hard-coded "operations" (also keyed by field), and yielding a map whose values are sequences by field (I think). Off the top of my head I don't see why this fn needs to work across multiple fields in the first place (i.e., any field-specific "loop" should be in a surrounding context). Ditto for `diff-rules-by-keys`.

[3] https://youtu.be/vK1DazRK_a0?t=461

blunte · 5y ago

That Rafal Dittwald video is excellent. It gives a small but illustrative comparison of procedural, OOP, and finally functional... and using JavaScript (thereby making it accessible to non-lispers).

Most developers should watch it.

Narishma · 5y ago · 1 in thread

The code snippets in the article are unreadable due to low contrast.

kimi · 5y ago

Not sure about Rust, but on the Clojure side, the code does not seem pretty.

raspasov · 5y ago

Not very clear what the diff tool is attempting to do.

Just looking at the Clojure code, I feel there are better approaches in Clojure to achieve the same or better results.

More clarity would be helpful.

Also, transducers are a big performance win for long sequences of values.

You cannot even begin to guess where your performance problems are until you use something like YourKit (https://www.yourkit.com) which is an excellent tool. With very little effort and a few type hints you can sometimes more than double your Clojure performance.

omn1 · 5y ago

Great post! I like articles where the author has practical experience in two languages and compares them. Helps me make better decisions for future projects.

qz2 · 5y ago

This is the most HN article I have ever seen.

saleheen · 5y ago

Great Post!
