
kerkeslager 3y ago

I appreciate your polite tone here. To expand on this at the risk of sounding a bit rude: nobody should listen to anyone who speaks about performance in terms of reasoning about a system instead of profiling it.

Computers are shockingly complex. I can't tell you how many times I've reasoned about a system, run the profiler, and discovered I was completely wrong.

When I was working on an interpreter for a Lisp, I implemented my first cut of scopes (all the variables within a scope and their values) as a naive unsorted list of key/value pairs, thinking I'd optimize later. When I came back to optimize, I reimplemented this as a hashmap, but when I ran my test programs, to my horror, they were all 10x slower. I plugged in a hashmap library used in lots of production systems and got a significant 2x performance gain, which was still slower than looping over an unsorted list of key/value pairs. The fact is, most scopes have <10 variables, and at that size, looping over a list is faster than a hashmap's constant-time lookup. I can reason about why this is, but that's just fitting my reasons to the facts ex post facto. Reasoning didn't lead me to the correct answer; observation did.
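For anyone who wants to try this kind of comparison themselves, here is a minimal sketch of a micro-benchmark in Python. It is not the interpreter from the story: the function names (`lookup_list`, `lookup_dict`, `make_scopes`, `bench`) and the scope sizes are made up for illustration, and the outcome depends heavily on the language and runtime, so the numbers it prints say nothing about the original Lisp implementation. The point is only that the measurement is cheap to set up.

```python
import timeit


def lookup_list(scope, name):
    """Linear scan over an unsorted list of (name, value) pairs."""
    for key, value in scope:
        if key == name:
            return value
    raise KeyError(name)


def lookup_dict(scope, name):
    """Hash-table lookup using Python's built-in dict."""
    return scope[name]


def make_scopes(n):
    """Build the same n-variable scope in both representations."""
    names = [f"var{i}" for i in range(n)]
    pairs = [(name, i) for i, name in enumerate(names)]
    return names, pairs, dict(pairs)


def bench(n, repeats=100_000):
    """Time both lookups for the last variable (worst case for the list)."""
    names, pairs, table = make_scopes(n)
    target = names[-1]
    t_list = timeit.timeit(lambda: lookup_list(pairs, target), number=repeats)
    t_dict = timeit.timeit(lambda: lookup_dict(table, target), number=repeats)
    return t_list, t_dict


if __name__ == "__main__":
    # Hypothetical scope sizes; pick ones that match your real workload.
    for n in (2, 8, 64):
        t_list, t_dict = bench(n)
        print(f"n={n:3d}  list={t_list:.4f}s  dict={t_dict:.4f}s")
```

A micro-benchmark like this is still only a hint about what to look at. The real test is profiling the actual programs you care about.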

Returning to parallel data structures, the fact is that I don't know why lock-free structures are faster than mutex-based structures; I just know that they are in every situation where I've profiled them.

Reasoning isn't completely useless--reasoning is how you intuit what you should be profiling. But if you're just reasoning about how two alternatives will perform and not profiling them in real-life production systems you're wasting everyone's time.


slaymaker1907 3y ago

I don't think that's quite correct because there are many optimizations which are impossible (or nearly impossible) to do after a system is implemented. Daniel Lemire wrote an excellent post on this exact subject https://lemire.me/blog/2023/04/27/hotspot-performance-engine....

In terms of programming languages, I think Python is an excellent example of a language with many features that have made it extremely difficult to optimize, even compared to other dynamic languages like Lisps. Even if you don't have to worry about backwards compatibility, there are design decisions that limit performance and can only be changed by rewriting the entire system.

kerkeslager (OP) 3y ago

> I don't think that's quite correct because there are many optimizations which are impossible (or nearly impossible) to do after a system is implemented.

Who said you have to profile after a system is implemented? Certainly I didn't: if anything, I prefer to profile during prototyping, although it seems few companies outside the largest budget for any real prototyping these days. Usually I settle for timing things and profiling as early as possible, so that I can catch any performance issues before any calcifying structure is built around the non-performant code.

Yes, I did profile after the fact in the Lisp story, but my point in that story was that my reasoning led me to the wrong conclusions, not that I did everything perfectly (on the contrary, it's a story about learning from my mistakes!).

I agree that the Daniel Lemire post is excellent, but nothing in that post leads me to believe he'd disagree with anything I've said.

kazinator 3y ago

Optimizing Lisp lexical environments with hashes is going to be a fool's errand. The way you optimize lexical environments is by compiling them, so that there is no searching at all.
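To make "no searching at all" concrete, here is a rough sketch in Python of one common way to do it, often called lexical addressing: the name is resolved once at compile time into a (depth, index) pair, and the runtime access is just two array indexing operations. This is only an illustration under my own assumptions, not how any particular Lisp does it; `resolve` and `compile_ref` are hypothetical names.

```python
def resolve(name, static_scopes):
    """Compile-time step: search the nested scopes once.

    static_scopes[0] is the innermost scope; each scope is a list of names.
    Returns a (depth, index) address for the variable.
    """
    for depth, names in enumerate(static_scopes):
        if name in names:
            return depth, names.index(name)
    raise NameError(name)


def compile_ref(name, static_scopes):
    """Compile a variable reference into a closure that does no searching."""
    depth, index = resolve(name, static_scopes)

    def run(frames):
        # frames[0] is the innermost runtime frame, a plain list of values
        # laid out in the same order as the compile-time name lists.
        return frames[depth][index]

    return run


if __name__ == "__main__":
    static = [["x", "y"], ["z"]]      # inner scope, then outer scope
    get_z = compile_ref("z", static)  # the search happens here, once
    print(get_z([[1, 2], [3]]))       # the runtime lookup is pure indexing
```

All the name comparison happens in `resolve`. After that the names could be thrown away entirely, which is why the choice between a list and a hashmap stops mattering once the code is compiled.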

The value in the interpreter is that it provides an alternative implementation of the semantics. This becomes particularly valuable in some area that happens to be under-documented, and the compiler and interpreter are found to disagree.

It can be easier to get the semantics right in the interpreter. A simple implementation of environments (and whatnot) reduces the likelihood of bugs and leaves the code readable. Interpreted behavior of special forms can usually serve as the reference implementation for compilation.

If an interpreter is slow, that just means that the build steps for bootstrapping the compiler using the interpreter take longer.
