Idiomatic Clojure without sacrificing performance (opens in new tab)

(bsless.github.io)

150 pointsceronman4y ago79 comments

79 comments

45 comments · 12 top-level

MillenialMan4y ago· 14 in thread

The biggest performance issue Clojure has, which isn't mentioned in the article and is fundamentally unsolvable, is that it misses the CPU cache - a lot.

The data structure that drives the immutable variables, the Hash-Array Mapped Trie, is efficient in terms of time- and space- complexity, but it's incredibly CPU cache-unfriendly because by nature it's fragmented - it's not contiguous in RAM. Any operation on a HAMT will be steeped in cache misses, which slows you down by a factor of hundreds to thousands. The longer the trie has been around, the more fragmented it is likely to become. Looping over a HAMT is slower than looping over an array by two or three orders of magnitude.

I don't think there is really a solution to that. It's inherent. Clojure will always be slow, because it's not cache friendly. The architecture of your computer means you physically can't speed it up without sacrificing immutability.

sesm4y ago

It is mentioned in the article, one of the last optimizations done there was switching to array. Also, Clojure’s HAMT was designed with CPU cache in mind and it’s performance characteristics don’t degrade over time. Immutable data structures will be slower than arrays - that’s true, but Clojure standard library works on arrays just fine, as demonstrated in the article.

MillenialMan4y ago

You're right - I missed that, the author does mention arrays having better access characteristics, although he doesn't really explain why HAMTs specifically are slow.

How does Clojure's HAMT avoid fragmenting over time?

> Clojure standard library works on arrays just fine, as demonstrated in the article.

Right, but then you don't have immutability - so you lose all the guarantees that you originally had with immutable-by-default.

2 more replies

jeofken4y ago

I’m familiar with the implementation of HAMTs - if anyone wants to study one in C I recommend https://github.com/Jamesbarford/hash-array-mapped-trie or my polymorphic fork of it https://github.com/fromheten/hash-array-mapped-trie-poly.

Are there any other key/value data structures where insertion and retrieval are less than O(n) in complexity, but where the memory layout is better ordered for cache hits during searches? Maybe good old red-black trees?

_bsless4y ago

I don't think there's an immediately available answer here. The wide branch factor in Clojure's implementation is very iteration-friendly, less so for updates.

Worth checking out are BTrees and Hitchhiker trees, but I think a definitive answer will be implementation dependent even in those cases, i.e. one might win out over the other for a specific branch factor or other tune-able parameters

dgb234y ago

I wonder if the wide branching factor of them gives you some cache friendliness.

However, not every use case can benefit from cache optimization and you can use other data structures. It’s not super useful to make generalizations about performance that way.

j-pb4y ago

Yes-ish: https://db.in.tum.de/~leis/papers/ART.pdf

reitzensteinm4y ago

While I do agree, with pointer chasing down a persistent data structure being basically the worst case use of a CPU, the ease of threading Clojure programs means you can often claw a lot of that penalty back.

MillenialMan4y ago

Threading doesn't compensate for that degree of slowdown, and itself has overhead. You'll get something back, but not much.

2 more replies

_bsless4y ago

Take that with a grain of salt, from experiments run by myself and others I can share this does not scale linearly.

Hope some performance experts can chime in but this might be related to data structure sharing between caches or allocation contention in the JVM.

1 more reply

dan-robertson4y ago

This is entirely hypothesising but I don’t see how these wide trees are awful for the cache. In particular I don’t think they would be much worse than a flat array of objects—the problem is, I think, that objects are references and iterating the things in an array of pointers usually sucks.

For a HAMT, With depth (eg) 2, you should be able to prefetch the next node in your iteration while you iterate the current node of 64 elements. Actually iterating through the objects is going to suck but hopefully you can prefetch enough of them too. (Or maybe hotspot could online things enough to improve memory access patterns a bit to take advantage of ILP). There’s still the problem that you can’t read main memory sequentially except that often all the objects in a HAMT were allocated sequentially so you can read main memory sequentially (as allocation is typically just bumping a pointer, allocation-time-locality tends to correspond to address-locality)

lukashrb4y ago

If I understand you correctly, this i a general problem of functional data structures?

> Clojure will always be slow, because it's not cache friendly.

You always have the option to use the java data structures, for the cases this kind of optimization is needed.

MillenialMan4y ago

Yes, this is a general problem with functional data structures. They have to be fragmented in order to share data. There's also the more nebulous issue that they encourage nesting to leverage the architectural benefits of immutability, which is a complete disaster for cache friendliness.

Replacing the critical path is an option, but that only works for numpy-style situations where you can cleanly isolate the computational work and detach it from the plumbing. If your app is slow because the inefficiencies have added up across the entire codebase (more common, I would argue), that's not an easy option.

2 more replies

fulafel4y ago

A traditional hash table can also be pretty cache unfriendly. I wonder if there are any published measurements that compare these.

gpderetta4y ago

Partially unrolled node based data structures could help. Does clojure use them?

jwr4y ago· 5 in thread

A datapoint from a self-funded entrepreneur, for whom an app written in Clojure and ClojureScript provides a living:

Before you start discussing whether a language is "slow", check if perhaps it is "fast enough".

Apart from trying not to write stupid code, I don't even optimize anymore, because coupled with ridiculously fast servers (bare-metal hardware from Hetzner in my case), it's a waste of time. Things run great, and I'd much rather invest in building complex app features that customers will pay for, and have flexible living code that I can easily rearchitect as needed.

The last time I had to optimize Clojure code was when I wrote a search engine that had to respond to incremental searches in a fairly large E-commerce product databases. But that was also about 12 years ago, so computers were quite a bit slower, and JVMs were not as good as they are today.

I do tend to use transducers quite a bit, because they provide great composability and structure the code nicely, while providing a good performance boost as well (you avoid creating lots of intermediate data structures, which not only saves CPU, but also reduces cache, memory and GC usage) — but I don't worry about performance too much. It's just not a problem that appears on the radar.

TacticalCoder4y ago

> Apart from trying not to write stupid code, I don't even optimize anymore, because coupled with ridiculously fast servers (bare-metal hardware from Hetzner in my case), it's a waste of time.

Inquiring mind wants to know: how do you deploy your Clojure (Script) webapp on your bare server? Just an "uber war" that you drop there? (bare servers at OVH here btw but I know Hetzner is good too)

jwr4y ago

Yes, I used a single uberjar for the first year or so, but then it got a bit more involved, because of delivery optimizations (nginx, brotli, precompressed files, versioned immutable static content with infinite cacheability, etc). So these days it's an uberjar (AOT-compiled) plus a bunch of files, essentially.

The servers are managed using ansible, and I also have a terraform setup for quickly spinning up a development cluster "in the cloud". Ansible takes a list of servers either from a static file or from terraform.

askonomm4y ago

To add to this, you can also compile an uberjar into a static executable with Graal VM. I use that to build CLI apps for example, and it works wonders.

1 more reply

currentoor4y ago

So what does your app do? And what have been the biggest drawbacks from your language choice?

jwr4y ago

It's a PLM/ERP type app for electronics — https://partsbox.com/

Drawbacks, hmm — I'm not sure, really, there is no silver bullet, and I can't see significant problems with the language itself that another language would magically solve. I'm pretty happy with Clojure and ClojureScript and some advantages are hard to replicate elsewhere (such as business logic code being written only once and then shared between Clojure and ClojureScript).

One thing I noticed is I got spoiled by Rich Hickey solving problems for me, and I'd really like him to continue doing so. For example, transducers and core.async were great, but spec isn't quite finished yet. Also, reporting status and anomalies (as in, status of operations, computations, including errors/exceptions) and passing it through channels or foreign encodings is a problem I had to solve myself. I'm pretty sure Rich could come up with a better solution.

The biggest problems that I encounter everyday are not with the language itself, but with the ecosystem. Many libraries could use more maintenance. And the really big problems aren't language-related at all. For example, RethinkDB shutting down. I also wish I didn't have to write the full FoundationDB integration ("language binding") myself.

thom4y ago· 3 in thread

I personally feel that people for whom all the underlying JVM/Java stuff is deeply icky and unidiomatic are missing out on a lot. Clojure has some deeply held principles but also a whole heap of pragmatism and that’s what’s always kept it useful to me.

pjmlp4y ago

That is one of the reasons I dislike guest languages, the communities tend to have a culture of rebuilding the underlying stack, as if not appreciating that it is what allows their favourite language to exist in first place.

gavinray4y ago

What is a "guest language", haven't heard this term used outside of GraalVM before?

From context, it's a language that uses the platform of a host?

IE, JVM languages like Kotlin/Scala/Clojure when Java was the initially intended one? Or CLR languages like VB/F# which are second-class in comparison to C#?

If it is this, the one interesting thing I have to note is that I've found that sometimes Kotlin/Scala will produce better bytecode and performance than what I would have written naively by hand.

Just due to them emitting horror-inducing code/optimizations that is great for performance but never meant for human eyes.

1 more reply

didibus4y ago

Clojure explicitly embraces the host, and it is also one of the first thing people criticizes haha.

    Language built for platform vs language ported-to platform
    - Many new languages still take 'Language as platform' approach
    - When ported, have platform-on-platform issues
    - Memory management, type-system, threading issues
    - Library duplication
    - If original language based on C, some extension libraries written in C don’t come over

There's a lot of time where the core team says, Java already does this well, just use the one from Java.

medo-bear4y ago· 3 in thread

i like in depth optimisation posts like this post. however it starts by expressing disagreement with the statement that idiomatic common lisp is much faster than idiomatic clojure, but in the end doesnt do much to dispell this. all it does is it shows how much in depth knowledge of clojure and jvm you need in order to still be slower than a very simple implementation in common lisp. i myself am quite new to common lisp, and comming from the machine learning and scientific computing background, i am VERY impressed by its speed, interactiveness, and just general neatness of programming. i don't think there exists a language that comes close to this. i wish there were more members from ml and scientific community involved in the common lisp community. in fact there is one person in the clojure community who i would love to try common lisp: https://dragan.rocks/

_bsless4y ago

That's a fair criticism, but indulge this hypothetical, if you will:

- assume the sliding window transducer existed in the standard library (there's an open jira for it)

- assume the programmer is not an expert, but is familiar with stuff like boxed maths in Clojure, which isn't hard to work around, well documented [1], and has a known performance impact

- assume the programmer will avoid using `last`

- assume the programmer is comfortable using transducers

In short, that the transducer exists and the programmer has at least 6 months of experience programming in Clojure full time.

Will you consider the resulting implementation complicated, in depth or non idiomatic? There's always the possibility I'm just used to it so it comes naturally to me, but as the title suggests, I consider the result, before dropping down to arrays, way more idiomatic than the initial implementation.

I don't begrudge CL's speed and implementation, it's the readiness with which Clojure is dismissed as being slow with only surface familiarity with it.

[1] - https://insideclojure.org/2014/12/15/warn-on-boxed/

V1ndaar4y ago

While I write almost all my code in Nim nowadays, lisps have a special place in my heart despite not having written large amounts of lisp code.

I'm a physicist and when checking out CL in the past, it didn't feel like scientific computing was a huge thing in CL. You intrigue me with your post though. Do you have any resources for me to check out?

medo-bear4y ago

my background is also in physics and mathematics. in fact i think common lisp is well suited to a physicist-hacker type mindset. i could probably write a lengthy blog post about why i think this could be, but some of the key appeal factors are:

* unmatched interactive development which could be used for exploration of problems such as dynamical and complex systems

* although it is a very high level language you can perform low level optimizations in common lisp itself

* extremely powerful macro system that allows you to develop in a declarative style to reflect your problem domain. for example you can program in a syntax that is natural to general or special relativity, quantum mechanics, etc.

* fast and free implementations such as SBCL

* the language is standardized so your code will not need to be unnecessarily maintained (this is good for university professors that have repeating long-running courses)

i am not sure what type of references you are asking for but i will list few open source projects that might be interesting to you:

* CLASP - common lisp implementation build on top of LLVM. this project is headed by a computational chemist (prof. schafmeister) [0]

* PetaLisp - for HPC [1]

* MAGICL - Matrix Algebra proGrams In Common Lisp by Rigetti Computing [2]

* Quil - quantum instruction language developed on top of common lisp [3]

* CL-CUDA - use NVIDIA CUDA in Common Lisp programs [4]

* MGL - Common Lisp machine learning library [5]

* Maxima - a computer algebra system developed in common lisp [6]

i think these are plenty to show that there exists serious interest in using common lisp for scientific computing. it is possible that some of the people behind these projects are on HN and maybe they can further expand on this

[0] https://github.com/clasp-developers/clasp

[1] https://github.com/marcoheisig/Petalisp

[2] https://github.com/quil-lang/magicl

[3] https://github.com/quil-lang/quilc

[4] https://github.com/takagi/cl-cuda

[5] https://github.com/melisgl/mgl also https://github.com/melisgl/higgsml

[6] https://maxima.sourceforge.io/

1 more reply

Blackthorn4y ago· 2 in thread

The performance difference between (peek n) and (last n) for vector typed n strikes me as something that really ought to be handled by the implementation, not the programmer since the outputs are equivalent.

I really like clojure, like, a lot, but I might have taken an opposite than intended result from this blog post. It made me feel like the language has a number of performance footguns that don't necessarily need to exist.

geokon4y ago

It seems like an intentional design choice. didibus sorta explained it to me: https://news.ycombinator.com/item?id=28406315

It does feel like a tradeoff that was a bit too harsh. As the post shows, you can work around it... but at the end of the day my code ends up with a ton of (nth 0) and very few (first) calls. It feels a bit ugly.. and it doesn't feel "idiomatic" as in how you learn Clojure from a textbook. Some more discussion of it here: https://clojureverse.org/t/what-do-beginners-struggle-with/5...

I find the threaded stuff much easier to follow. Maybe it's mostly visual and familiarity

And as an interesting extra: https://clojureverse.org/t/x-x-auto-transducifying-thread-ma...

dgreensp4y ago

I agree. Also, avoiding Clojure’s collections entirely is a great way to get better performance, but it’s hard to argue that is idiomatic. This applies doubly to ClojureScript.

creamytaco4y ago· 2 in thread

Pretty sure that this isn't the post's purpose but it makes me want to never touch Clojure. I'm obviously not used to it however I can't help but think that it takes serious mental gymnastics to consider the "optimized" code the author came up with as even remotely good, from readability / comprehensibility / extensibility perspective.

didibus4y ago

    (def array-xf
      (comp
       (sliding-array 8 1)
       (keep (fn [^objects arr]
               (when (> 1000 (unchecked-subtract
                              (long (aget arr 7))
                              (long (aget arr 0))))
                 arr)))))

This is a standard feature of Clojure you learn called a transducer pipeline. It describes a pipeline of transformation for transforming a source:

(sliding-array 8 1)

This says to iterate 8 elements at a time in a sliding window manner, so you grab element 0 to 7, then 1 to 8, then 2 to 9, etc., and where the type of the windows will be an array.

(keep fn)

This says that for each of those sliding window, keep only the non-nil values returned by fn.

(fn [^objects arr] (when (> 1000 (unchecked-subtract (long (aget arr 7)) (long (aget arr 0)))) arr))

Finally, this is the function used by keep. It receives as input an array of each sliding window of 8 elements, and it says when the value of the last element in it minus the first element in it is greater than 1000 return the sliding window, otherwise return nil.

In effect, the whole thing will thus return only the sliding windows for which the value of the last minus first element in the window is greater than 1000.

(sequence array-xf times-v)

array-xf only describes the computation, it is generic to the source, so you need to apply it to a source, that's what this code does. It says to apply array-xf to times-v and give you a sequence of the result.

capableweb4y ago

Are you talking about the lispness/parenthesis of the code or is there something else that makes the code inherently difficult for you to comprehend?

roenxi4y ago· 1 in thread

This does underline the weaknesses of Clojure and performance. For a typical language, the advice is (1) don't call expensive functions and (2) loop less.

For Clojure, we get the wonderful quote "There is a close transducer, partition-all, which has no step arity.".

One of Clojure's biggest weaknesses in practice is that breaking in to those functional structures to figure out where the time is being spent or to debug them is harder than in other languages. This is a natural trade-off of developing a terse and powerful language.

It is notable how much trouble Clojure has had over the years linking a bug back to a line number and a reason. Even with spec.

raspasov4y ago

>>> One of Clojure's biggest weaknesses in practice is that breaking in to those functional structures to figure out where the time is being spent or to debug them is harder than in other languages. This is a natural trade-off of developing a terse and powerful language.

Not that hard if you use something like YourKit. There's also a quite good Clojure library https://github.com/hugoduncan/criterium .

drcode4y ago· 1 in thread

I am someone who uses Clojure a lot and understands transducers, but I would pretty much never consider transducers "idiomatic code".

In my opinion they are a straight-up an optimization and using them definitely reduces code readability for the vast majority of people who would ever read your code (yes, it makes your code less readable even if it sometimes doesn't increase the token count of your code)

hawkice4y ago

I basically agree with the sentiment that, even with the sliding transducer not being in a library, the code he ends up with is far better (using keep w/ the checked - is probably the clearest version). With tests you can get convinced the transducer basically works even if you don't understand it, and it _is_ the idiomatic way of writing a transducer, even if writing them at all is unusual.

eigenhombre4y ago· 1 in thread

For the record, I'm the author of the original post on optimizing Common Lisp that the OP is addressing. The OP is a wonderful deep dive into optimizing Clojure. However, aside from possibly from avoiding the "last" function, and maybe using "keep" rather than "map" + "filter", I personally would avoid some of the strategies he outlines unless I truly needed the performance.

Transducers, for example, have a couple of counter-intuitive features that make them harder (for me, at least) to read... the order of "comp" arguments, for example, is reversed from normal usage. We tend to avoid them at my work unless they are truly needed for performance. In a complicated business domain, we struggle much more with building understanding than with slow code, so readability matters quite a bit.

(I leave aside the question of what is "idiomatic," since it seems pretty subjective, and probably depends somewhat on the crowd you run with.)

As several people have written here, Clojure is fast enough without optimization most of the time. (I leave aside the question of startup time -- a very different discussion.) Though I'm curious how to get good performance, most of the time it matters more to me for the code to be easier to understand than for it to run optimally fast, unless the code creates a bottleneck that has some clear impact on usability.

My goal in writing the original article was to play with optimizing Common Lisp as a learning exercise, rather than to throw down the gauntlet and assert that "Clojure is slow." The author's response was to show how to optimize the Clojure, which was very interesting! Ultimately, both languages have their strengths and weaknesses, and I consider myself lucky to be able to write Clojure for pay.

didibus4y ago

Thanks, I do think your original article just got people curious for the challenge, all good fun in my opinion.

didibus4y ago· 1 in thread

On Slack I gave it a go as well, my naïve idiomatic solution turned out to be 50x faster from the get go.

Times are all from my laptop using the same test input and running on JDK 11.

Baseline smt-8: Execution time mean : 2.386019 sec

My first attempt: Execution time mean : 48.977240 ms

    (defn smt-8'' [times-vec]
      (loop [res (transient []) pointer-1 0 pointer-2 7]
        (if-let [end-element (get times-vec pointer-2)]
          (let [start-element (get times-vec pointer-1)
                time-diff (- end-element start-element)]
            (recur
             (if (< time-diff 1000)
               (conj! res [(subvec times-vec pointer-1 (inc pointer-2))
                           time-diff])
               res)
             (inc pointer-1)
             (inc pointer-2)))
          (persistent! res))))

With performance tweaks: Execution time mean : 23.567174 ms

    (defn smt-8''' [times-vec]
      (binding [*unchecked-math* true]
        (loop [res (transient []) pointer-1 (int 0) pointer-2 (int 7)]
          (if-let [end-element (get times-vec pointer-2)]
            (let [start-element (get times-vec pointer-1)
                  time-diff (- end-element start-element)]
              (recur
               (if (< time-diff 1000)
                 (conj! res [(subvec times-vec pointer-1 (inc pointer-2))
                             time-diff])
                 res)
               (inc pointer-1)
               (inc pointer-2)))
            (persistent! res)))))

Less idiomatic as I'm using an array: Execution time mean : 1.678226 ms

    (defn smt-8'' [^"[J" times-arr]
      (binding [*unchecked-math* true]
        (loop [res (transient []) pointer-1 (int 0) pointer-2 (int 7)]
          (if (< pointer-2 (alength times-arr))
            (let [start-element (aget times-arr pointer-1)
                  end-element (aget times-arr pointer-2)
                  time-diff (- end-element start-element)]
              (recur
               (if (< time-diff 1000)
                 (conj! res [(mapv #(aget times-arr (+ pointer-1 %)) (range 8))
                             time-diff])
                 res)
               (inc pointer-1)
               (inc pointer-2)))
            (persistent! res)))))

All my solutions produce exactly the same result where it includes both the list of elements and their time difference.

didibus4y ago

Correction, you have to do:

(set! unchecked-math true)

globally, wrapping in a binding like I did does nothing, since it needs to be set at compile time. My measurement were with it set globally.

delegate4y ago

Clojure's power is not in its ability to be fast.

Immutability has a price and it's a price we're consciously paying. JVM also has a performance cost, but it's well worth paying due to the wins of running on a virtual machine.

For most boiler-plate tasks, especially where I/O is involved, Clojure is fast enough (eg. handling HTTP requests or talking to a database).

The true power of Clojure, in my opinion, is in how it allows you to think about problems; that machines produce side effects by interpreting a data structure, that the program is one transformation of the input data structure to the target data structure for the desired side effect. You don't think in terms of imaginary objects 'doing' things, you think about the input data structure and how the output data structure should look like, such that if fed to a library function, the desired side effect will occur (eg. the text will be printed or the robot arm will rise).

This mode of thinking about problems is closer to how machines actually work on a higher level and amusingly, 'close to the metal' C++, offers abstractions which take the writer into a fairy tale of class hierarchies, methods that 'do' things to other things, inserting, deleting, mutating everything all over the place. Because that's efficient and fast (and dangerous). But it's also focused a lot on telling the machine 'how' to run; while coincidentally making it do 'what' you want it to do. As a wizard who can put that orchestra with swords and knifes in motion without too much blood spilling on the floors, you can't but feel proud to have achieved it.

Data in -> data out. That's Clojure's way of thinking.

It's a lot simpler way of looking at the problem and maybe makes the `->` feel a bit insignificant and maybe not that important in most cases, as long as it produces the correct output.

The biggest performance hit will be choosing the wrong data structure for solving the problem in a functional way. If that's done correctly, Clojure does offer some tools for squeezing a bit more out of performance critical pieces of code (the articles covers some). Those should be a couple of functions in all your code base; for the boilerplate and domain logic or code involving non realtime I/O, it doesn't matter that much.

If performance is a big concern and you end up writing imperative code in Clojure, I think you'll be better off writing the bits in Java or choose a different language, which is better designed for that purpose.

eggy4y ago

So in the end, a vanilla CL implementation is still slower than all the work that went into the Clojure version? I love SBCL, and I am rediscovering Lisp after playing with a lot of other PLs. This is reassuring to me in my choice to go with SBCL and CL.

j / k navigate · click thread line to collapse

79 comments

45 comments · 12 top-level

MillenialMan4y ago· 14 in thread

The biggest performance issue Clojure has, which isn't mentioned in the article and is fundamentally unsolvable, is that it misses the CPU cache - a lot.

sesm4y ago

MillenialMan4y ago

You're right - I missed that, the author does mention arrays having better access characteristics, although he doesn't really explain why HAMTs specifically are slow.

How does Clojure's HAMT avoid fragmenting over time?

> Clojure standard library works on arrays just fine, as demonstrated in the article.

Right, but then you don't have immutability - so you lose all the guarantees that you originally had with immutable-by-default.

2 more replies

jeofken4y ago

_bsless4y ago

I don't think there's an immediately available answer here. The wide branch factor in Clojure's implementation is very iteration-friendly, less so for updates.

dgb234y ago

I wonder if the wide branching factor of them gives you some cache friendliness.

However, not every use case can benefit from cache optimization and you can use other data structures. It’s not super useful to make generalizations about performance that way.

j-pb4y ago

Yes-ish: https://db.in.tum.de/~leis/papers/ART.pdf

reitzensteinm4y ago

MillenialMan4y ago

Threading doesn't compensate for that degree of slowdown, and itself has overhead. You'll get something back, but not much.

2 more replies

_bsless4y ago

Take that with a grain of salt, from experiments run by myself and others I can share this does not scale linearly.

Hope some performance experts can chime in but this might be related to data structure sharing between caches or allocation contention in the JVM.

1 more reply

dan-robertson4y ago

lukashrb4y ago

If I understand you correctly, this i a general problem of functional data structures?

> Clojure will always be slow, because it's not cache friendly.

You always have the option to use the java data structures, for the cases this kind of optimization is needed.

MillenialMan4y ago

2 more replies

fulafel4y ago

A traditional hash table can also be pretty cache unfriendly. I wonder if there are any published measurements that compare these.

gpderetta4y ago

Partially unrolled node based data structures could help. Does clojure use them?

jwr4y ago· 5 in thread

A datapoint from a self-funded entrepreneur, for whom an app written in Clojure and ClojureScript provides a living:

Before you start discussing whether a language is "slow", check if perhaps it is "fast enough".

TacticalCoder4y ago

> Apart from trying not to write stupid code, I don't even optimize anymore, because coupled with ridiculously fast servers (bare-metal hardware from Hetzner in my case), it's a waste of time.

Inquiring mind wants to know: how do you deploy your Clojure (Script) webapp on your bare server? Just an "uber war" that you drop there? (bare servers at OVH here btw but I know Hetzner is good too)

jwr4y ago

askonomm4y ago

To add to this, you can also compile an uberjar into a static executable with Graal VM. I use that to build CLI apps for example, and it works wonders.

1 more reply

currentoor4y ago

So what does your app do? And what have been the biggest drawbacks from your language choice?

jwr4y ago

It's a PLM/ERP type app for electronics — https://partsbox.com/

thom4y ago· 3 in thread

pjmlp4y ago

gavinray4y ago

What is a "guest language", haven't heard this term used outside of GraalVM before?

From context, it's a language that uses the platform of a host?

IE, JVM languages like Kotlin/Scala/Clojure when Java was the initially intended one? Or CLR languages like VB/F# which are second-class in comparison to C#?

If it is this, the one interesting thing I have to note is that I've found that sometimes Kotlin/Scala will produce better bytecode and performance than what I would have written naively by hand.

Just due to them emitting horror-inducing code/optimizations that is great for performance but never meant for human eyes.

1 more reply

didibus4y ago

Clojure explicitly embraces the host, and it is also one of the first thing people criticizes haha.

    Language built for platform vs language ported-to platform
    - Many new languages still take 'Language as platform' approach
    - When ported, have platform-on-platform issues
    - Memory management, type-system, threading issues
    - Library duplication
    - If original language based on C, some extension libraries written in C don’t come over

There's a lot of time where the core team says, Java already does this well, just use the one from Java.

medo-bear4y ago· 3 in thread

_bsless4y ago

That's a fair criticism, but indulge this hypothetical, if you will:

- assume the sliding window transducer existed in the standard library (there's an open jira for it)

- assume the programmer is not an expert, but is familiar with stuff like boxed maths in Clojure, which isn't hard to work around, well documented [1], and has a known performance impact

- assume the programmer will avoid using `last`

- assume the programmer is comfortable using transducers

In short, that the transducer exists and the programmer has at least 6 months of experience programming in Clojure full time.

I don't begrudge CL's speed and implementation, it's the readiness with which Clojure is dismissed as being slow with only surface familiarity with it.

[1] - https://insideclojure.org/2014/12/15/warn-on-boxed/

V1ndaar4y ago

While I write almost all my code in Nim nowadays, lisps have a special place in my heart despite not having written large amounts of lisp code.

medo-bear4y ago

* unmatched interactive development which could be used for exploration of problems such as dynamical and complex systems

* although it is a very high level language you can perform low level optimizations in common lisp itself

* fast and free implementations such as SBCL

* the language is standardized so your code will not need to be unnecessarily maintained (this is good for university professors that have repeating long-running courses)

i am not sure what type of references you are asking for but i will list few open source projects that might be interesting to you:

* CLASP - common lisp implementation build on top of LLVM. this project is headed by a computational chemist (prof. schafmeister) [0]

* PetaLisp - for HPC [1]

* MAGICL - Matrix Algebra proGrams In Common Lisp by Rigetti Computing [2]

* Quil - quantum instruction language developed on top of common lisp [3]

* CL-CUDA - use NVIDIA CUDA in Common Lisp programs [4]

* MGL - Common Lisp machine learning library [5]

* Maxima - a computer algebra system developed in common lisp [6]

[0] https://github.com/clasp-developers/clasp

[1] https://github.com/marcoheisig/Petalisp

[2] https://github.com/quil-lang/magicl

[3] https://github.com/quil-lang/quilc

[4] https://github.com/takagi/cl-cuda

[5] https://github.com/melisgl/mgl also https://github.com/melisgl/higgsml

[6] https://maxima.sourceforge.io/

1 more reply

Blackthorn4y ago· 2 in thread

geokon4y ago

It seems like an intentional design choice. didibus sorta explained it to me: https://news.ycombinator.com/item?id=28406315

I find the threaded stuff much easier to follow. Maybe it's mostly visual and familiarity

And as an interesting extra: https://clojureverse.org/t/x-x-auto-transducifying-thread-ma...

dgreensp4y ago

I agree. Also, avoiding Clojure’s collections entirely is a great way to get better performance, but it’s hard to argue that is idiomatic. This applies doubly to ClojureScript.

creamytaco4y ago· 2 in thread

didibus4y ago

    (def array-xf
      (comp
       (sliding-array 8 1)
       (keep (fn [^objects arr]
               (when (> 1000 (unchecked-subtract
                              (long (aget arr 7))
                              (long (aget arr 0))))
                 arr)))))

This is a standard feature of Clojure you learn called a transducer pipeline. It describes a pipeline of transformation for transforming a source:

(sliding-array 8 1)

This says to iterate 8 elements at a time in a sliding window manner, so you grab element 0 to 7, then 1 to 8, then 2 to 9, etc., and where the type of the windows will be an array.

(keep fn)

This says that for each of those sliding window, keep only the non-nil values returned by fn.

(fn [^objects arr] (when (> 1000 (unchecked-subtract (long (aget arr 7)) (long (aget arr 0)))) arr))

In effect, the whole thing will thus return only the sliding windows for which the value of the last minus first element in the window is greater than 1000.

(sequence array-xf times-v)

capableweb4y ago

Are you talking about the lispness/parenthesis of the code or is there something else that makes the code inherently difficult for you to comprehend?

roenxi4y ago· 1 in thread

This does underline the weaknesses of Clojure and performance. For a typical language, the advice is (1) don't call expensive functions and (2) loop less.

For Clojure, we get the wonderful quote "There is a close transducer, partition-all, which has no step arity.".

It is notable how much trouble Clojure has had over the years linking a bug back to a line number and a reason. Even with spec.

raspasov4y ago

Not that hard if you use something like YourKit. There's also a quite good Clojure library https://github.com/hugoduncan/criterium .

drcode4y ago· 1 in thread

I am someone who uses Clojure a lot and understands transducers, but I would pretty much never consider transducers "idiomatic code".

hawkice4y ago

eigenhombre4y ago· 1 in thread

(I leave aside the question of what is "idiomatic," since it seems pretty subjective, and probably depends somewhat on the crowd you run with.)

didibus4y ago

Thanks, I do think your original article just got people curious for the challenge, all good fun in my opinion.

didibus4y ago· 1 in thread

On Slack I gave it a go as well, my naïve idiomatic solution turned out to be 50x faster from the get go.

Times are all from my laptop using the same test input and running on JDK 11.

Baseline smt-8: Execution time mean : 2.386019 sec

My first attempt: Execution time mean : 48.977240 ms

    (defn smt-8'' [times-vec]
      (loop [res (transient []) pointer-1 0 pointer-2 7]
        (if-let [end-element (get times-vec pointer-2)]
          (let [start-element (get times-vec pointer-1)
                time-diff (- end-element start-element)]
            (recur
             (if (< time-diff 1000)
               (conj! res [(subvec times-vec pointer-1 (inc pointer-2))
                           time-diff])
               res)
             (inc pointer-1)
             (inc pointer-2)))
          (persistent! res))))

With performance tweaks: Execution time mean : 23.567174 ms

    (defn smt-8''' [times-vec]
      (binding [*unchecked-math* true]
        (loop [res (transient []) pointer-1 (int 0) pointer-2 (int 7)]
          (if-let [end-element (get times-vec pointer-2)]
            (let [start-element (get times-vec pointer-1)
                  time-diff (- end-element start-element)]
              (recur
               (if (< time-diff 1000)
                 (conj! res [(subvec times-vec pointer-1 (inc pointer-2))
                             time-diff])
                 res)
               (inc pointer-1)
               (inc pointer-2)))
            (persistent! res)))))

Less idiomatic as I'm using an array: Execution time mean : 1.678226 ms

    (defn smt-8'' [^"[J" times-arr]
      (binding [*unchecked-math* true]
        (loop [res (transient []) pointer-1 (int 0) pointer-2 (int 7)]
          (if (< pointer-2 (alength times-arr))
            (let [start-element (aget times-arr pointer-1)
                  end-element (aget times-arr pointer-2)
                  time-diff (- end-element start-element)]
              (recur
               (if (< time-diff 1000)
                 (conj! res [(mapv #(aget times-arr (+ pointer-1 %)) (range 8))
                             time-diff])
                 res)
               (inc pointer-1)
               (inc pointer-2)))
            (persistent! res)))))

All my solutions produce exactly the same result where it includes both the list of elements and their time difference.

didibus4y ago

Correction, you have to do:

(set! unchecked-math true)

globally, wrapping in a binding like I did does nothing, since it needs to be set at compile time. My measurement were with it set globally.

delegate4y ago

Clojure's power is not in its ability to be fast.

Immutability has a price and it's a price we're consciously paying. JVM also has a performance cost, but it's well worth paying due to the wins of running on a virtual machine.

For most boiler-plate tasks, especially where I/O is involved, Clojure is fast enough (eg. handling HTTP requests or talking to a database).

Data in -> data out. That's Clojure's way of thinking.

It's a lot simpler way of looking at the problem and maybe makes the `->` feel a bit insignificant and maybe not that important in most cases, as long as it produces the correct output.

eggy4y ago

j / k navigate · click thread line to collapse