Compiler Development: Rust or OCaml?

(hirrolot.github.io)

154 points · bshanks · 2y ago · 101 comments

62 comments · 18 top-level

nu11ptr · 2y ago · 8 in thread

The article has fair points, but after trying OCaml and Rust... I chose Rust. Without going into huge amounts of detail, a compiler is more than simply a parser/AST/code generator and there are other aspects to consider such as the richness of the ecosystem, editor support, etc. Also, I suspect the author is more familiar with OCaml than Rust, as you wouldn't typically box everything but would likely use an arena for the AST. In the same way, I am more familiar with Rust than OCaml, so some of the warts I observed may be due to lack of familiarity. As such I suspect the author's perspective is biased...as is mine. Nothing wrong with that.

ecshafer · 2y ago

Can you write an equivalent piece of code that shows why Rust wins here with more familiarity and leveraging the ecosystem? With compilers I don't think there is a huge amount of using the ecosystem. I think what TFA does is a good case study in trying to be objective: write it both ways and compare.

nu11ptr · 2y ago

I'm just saying a compiler is a program, and while the more important things are heavy algorithm related, supporting libraries like those for error handling, etc. all still matter and add up. No problem if you disagree - just my perspective. This isn't so much an objective thing as it is a personal opinion.

wredue · 2y ago

There was nothing objective about the article. The moment I saw seemingly random newlines in the Rust for no reason, I knew there was going to be a “line counts!!!!” sentence. When there was, I stopped reading because it was apparent that all manner of objectivity was completely missing.

I don’t even like rust.

hurril · 2y ago

I've been seriously at it with Rust for about six months now and really loving it. What do you mean by "use an arena for the AST?" What is an arena in this context?

fuklief · 2y ago

I believe it means flattening the AST, here is a nice blog post about this technique https://www.cs.cornell.edu/~asampson/blog/flattening.html
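A minimal sketch of the flattening idea, assuming a tiny boolean expression language rather than the article's AST: nodes live in one `Vec` and refer to each other by index instead of by `Box` or `Rc`. The names `ExprRef`, `Expr`, and `ExprPool` are made up for illustration and are not taken from the linked post or from any arena crate.

```rust
// Illustrative names only: ExprRef, Expr, and ExprPool are hypothetical.

/// A handle into the pool: just an index, cheap to copy.
#[derive(Clone, Copy, Debug, PartialEq)]
struct ExprRef(u32);

/// AST nodes point at children by handle rather than by Box.
#[derive(Debug)]
enum Expr {
    Bool(bool),
    Not(ExprRef),
    And(ExprRef, ExprRef),
}

/// The "arena": every node of the tree lives in one contiguous Vec.
#[derive(Default)]
struct ExprPool {
    nodes: Vec<Expr>,
}

impl ExprPool {
    /// Append a node and return its index as a handle.
    fn add(&mut self, expr: Expr) -> ExprRef {
        let idx = self.nodes.len() as u32;
        self.nodes.push(expr);
        ExprRef(idx)
    }

    /// Look a node up by handle.
    fn get(&self, r: ExprRef) -> &Expr {
        &self.nodes[r.0 as usize]
    }

    /// Recursive evaluation that follows handles instead of pointers.
    fn eval(&self, r: ExprRef) -> bool {
        match self.get(r) {
            Expr::Bool(b) => *b,
            Expr::Not(e) => !self.eval(*e),
            Expr::And(a, b) => self.eval(*a) && self.eval(*b),
        }
    }
}

fn main() {
    let mut pool = ExprPool::default();
    let t = pool.add(Expr::Bool(true));
    let f = pool.add(Expr::Bool(false));
    let not_f = pool.add(Expr::Not(f));
    let root = pool.add(Expr::And(t, not_f));
    println!("{}", pool.eval(root)); // true && !false
}
```

In this sketch the whole tree is freed at once when the pool is dropped, and handles are plain `Copy` integers, so there is no per-node allocation or reference counting.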

remexre · 2y ago

https://docs.rs/bumpalo/latest/bumpalo/

bobbylarrybobby · 2y ago

If you're curious about arena allocators, look at the rust compiler itself. It uses one for all of its allocations as regular heap allocation was not performant enough.

mughinn · 2y ago

i assume they're talking about an Arena Allocator

https://blog.logrocket.com/guide-using-arenas-rust/

hardwaregeek · 2y ago · 8 in thread

I skimmed the article, and comparing the two programs, yes, the OCaml one is shorter and more elegant. But it also reads like a math magic spell. There's no type annotations for me to figure out what the heck each term is. The naming conventions lean extremely terse. Perhaps it's my lack of experience with OCaml, but it doesn't feel as legible.

The Rust one reads like...well a program. A program that's not as beautiful, but is very much designed to be taken apart, debugged, improved, etc.

I fully agree that if you're writing pure, recursive, data structure manipulations, OCaml is likely a better fit. It's closer to mathematical notation and I see the elegance in that. But if I were to take that data structure manipulation and turn it into a compiler with thousands of lines that I navigate with an IDE, with logging, with annoying little edge cases, with dozens of collaborators, I'd choose Rust.

grumpyprole · 2y ago

> There's no type annotations for me to figure out what the heck each term is

You can add additional annotations in OCaml if you want, or just query the type of a term in Merlin.

> a compiler with thousands of lines that I navigate with an IDE, with logging, with annoying little edge cases, with dozens of collaborators, I'd choose Rust.

Why? OCaml supports logging and IDEs. Simple, elegant code without the burden of manual memory management is better able to cope with edge cases, being taken apart, refactored, etc. Less of the complexity budget has already been spent.

jasinjames · 2y ago

I like that term "Complexity Budget" a lot. I'm going to steal that one!

brmgb · 2y ago

I think it’s your inexperience talking here, as the programs in both languages use mostly the same naming conventions and abbreviations, the main difference being that OCaml is its usual terse and to-the-point self while Rust is, well, far less terse.

hardwaregeek · 2y ago

Yeah isn't that my point? Rust isn't trying to be short or elegant. There's no zen of Rust. There are elegant aspects of Rust, but it's not a central goal. Whereas with OCaml it's trying to be elegant. It's encouraging people to write a program where you read it, go "wait, how does that work?", then re-read it and marvel at the beauty of it.

To be clear, elegance is important. A language absent of elegance would be a bore to write (cough Java cough). But too much elegance and it can eclipse the legibility of the language. No type annotations is elegant. Is it legible? Not in my opinion. But perhaps it is in yours.

liampulles · 2y ago

Conversely, I'm not familiar with Rust, and Rust looks pretty illegible to me. Probably we need more experience with these languages before we judge.

whimsicalism · 2y ago

> There's no type annotations for me to figure out what the heck each term is.

If types can be perfectly inferred, I don't see why I wouldn't take advantage of it. Whatever Ocaml IDE you use will easily be able to tell you the types anyways

Karliss · 2y ago

I have no problem with type inference that happens within a limited scope. Once you get into the territory of type inference across multiple functions, you get a problem similar to C++ template errors (before C++20), where the compiler can tell that the types aren't correct but it's not clear where the actual mistake was made: did you pass a value with the wrong type to a function at the top, did you use a wrong function call 5 levels deeper, or did you access a wrong property somewhere in the middle? I am not sufficiently familiar with OCaml to tell how much of a problem it is for the specific example in the article, but I have some experience with C++ template errors and I remember a similar problem in Haskell if type annotations were used too sparingly. Not every programming language will have errors as bad-looking as C++ templates, but even if the errors are short it doesn't change the inherent problem that the compiler can't know what the intention behind a large type-inferred region was. More explicit types split the type inference into smaller regions, which means that the error message will be closer to the actual place of the mistake made by the programmer; it also allows checking each region individually for correctness.

nine_k · 2y ago

The problem is that the article is not an IDE. It would be nice to add some explicit annotations.

beaub · 2y ago · 8 in thread

Compilers are in this weird spot where they are really mathematically defined programs (which OCaml excels at implementing), while also having high runtime efficiency as a requirement (the reason why C/C++ are such prominent languages for compilers).

With such requirements, I think a point that is fair to make is that Rust acts as a great middle-ground. It avoids the cost of automatic memory management and provides low-level control while also having a more powerful type system and a more "functional" style.

Brushing off the actual efficiency of the produced binary seems like a huge oversight when dealing with a compiler.

memefrog · 2y ago

I am not sure that the runtime efficiency of the compiler binary is that important. People like fast compile times, but that is more to do with language design than the choice of language for the compiler.

You could write a compiler for Pascal in Python or another very slow language and it would be faster than a Rust or C++ compiler written in Rust or C++. That is because those languages have designs that make compilation algorithmically slow, while Pascal was designed to be fast to compile.

nine_k · 2y ago

Almost every compiler is fast on toy-sized programs. E.g. the standard Java compiler is pretty fast, and uses little resources.

It becomes visible when you build a large project: you notice that when you face 100k LOCs, efficiency of every compiler's part starts to matter, and RAM usage may grow to uncomfortable levels if your compiler does not care enough.

fauigerzigerk · 2y ago

>People like fast compile times, but that is more to do with language design than the choice of language for the compiler.

People like fast compile times and people either like to use (or are forced to use) languages that are inherently slow to compile. That's exactly why compiler performance is absolutely critical.

curling_grad · 2y ago

> I am not sure that the runtime efficiency of the compiler binary is that important.

If the compiler is for JIT, then efficiency will be important.

bjourne · 2y ago

The cost of automatic memory management is latency and increased memory usage. In a soft real-time system like a game, the garbage collector may cause lag spikes so you miss the head-shot on your opponent. You also require at least 50% more memory for efficient automatic memory management. Throughput, however, is not one of the costs. You can in fact achieve higher throughput with automatic memory management than with manual memory management.

tester756 · 2y ago

>while also having high runtime efficiency as a requirement (the reason why C/C++ are such prominent languages for compilers)

I'd want to believe that compiler engineers really put effort into compiler performance, but I just don't buy it.

LLVM, GCC, MSVC, etc, etc all of them touch C/C++ and are slow as hell

For compilers written in other languages I'd say that still LLVM is the bottleneck

>It avoids the cost of automatic memory management and provides low-level control.

What "low-level control" do you need? It is not firmware development.

Btw: Microsoft rewrote their C# compiler from C++ to C#.

JonChesterfield · 2y ago

Any compiler that gets used at runtime (branded JIT, usually) ends up growing performance hacks or being written from scratch to run quickly. Javascript is prone to using multiple compilers based on how frequently code was executed. That's also what the whole -O0 -O3 -flto -thin-lto -pgo etc flags are about, granting permissions to burn different amounts of time during compilation.

It's really easy to accidentally write code that walks off a performance cliff on unexpected input, but that's likely to get hacked around if someone reports it, as slow compilers do annoy people.

freeopinion · 2y ago

Did Microsoft rewrite their C++ compiler in C#?

josephg · 2y ago · 6 in thread

I normally love articles comparing programming languages at real tasks, but this article seems very low quality to me. The author clearly doesn't understand how rust thinks about programs. Instead, they're trying to pretend that rust is an alternate syntax for ocaml and being surprised to find it comes up short.

The same article could easily be written the other way around. We could start with a high performance rust program (which makes use of arena allocators, internal mutation and any other rust features you love) and then try and convert it line by line into ocaml. We would find that many of rust's concepts can't be clearly expressed in ocaml. The ocaml code would end up uglier and measurably slower than rust. And just like that the article would reach the opposite conclusion - that rust is clearly the better language!

But this is silly.

In general, you obviously can't translate between languages line by line like this and expect to have a good time. A beautiful C program is constructed using different ideas than a beautiful Lua program. And a beautiful Ocaml program is very different from a beautiful rust program.

Some obvious examples of ocaml ideas being overapplied to rust in this article:

1. The types don't really need to be wrapped in Rc here.

2. Rust generally prefers mutable imperative code over applicative code. And if you insist on applicative patterns, functions should take a &Foo.

3. Rust code usually doesn't rely on recursion that much, so the lack of guaranteed TCO isn't something people in the community care about.

4. Rust is optimized for runtime performance over code beauty or code size. Of course rust is less elegant looking than a garbage collected language! The trade is that it should also run faster. But where are the benchmarks to make the comparison fair?

The match example is just straight out bad rust code. This code:

    fn eval(term: &Term) -> Value {
        match term {
            Bool(b) => Value::Bool(*b),
            Not(m) => match eval(m) {
                Value::Bool(b) => Value::Bool(!b),
                _ => panic!("`Not` on a non-boolean value"),
            },
            // ... lots more nested matches & panics
        }
    }

Can be flattened, to approximately halve the length of the program like this:

    fn eval(term: &Term) -> Value {
        match term {
            Bool(b) => Value::Bool(*b),
            Not(Value::Bool(b)) => Value::Bool(!b),
            // ... (all other valid patterns)
            _ => panic!("{term} invalid"),
        }
    }

There's an old saying: "Every programming language you learn should teach you to see programs in a new way". Rust is not a crappy alternate syntax for ocaml any more than ocaml is a crappy, alternate syntax for rust. The only thing I learned from the article is that the author doesn't know rust well enough to evaluate it.

nequo · 2y ago

Your example is more concise but the error message "{term} invalid" is less descriptive than the author's.

What would be the idiomatic way for the function `eval` to provide good error messages in Rust?

rtpg · 2y ago

I would probably put a trait implementation on Value that does `.invert_bool`, and have that panic. That way eval is just `eval(m).invert_bool()` and if it panics it panics.

What you really probably want is to make this a Result type-returning thing, and then have `not` be a function of type Value -> Result<Value,ErrType>, and then you can do not(eval(m)) and panic at the top-level.
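A rough sketch of the `Result`-returning variant, assuming a cut-down `Term` with only `Bool`, `Int`, and `Not`. `EvalError` and the free function `not` are hypothetical names chosen for illustration, not the article's code.

```rust
// Cut-down, hypothetical versions of the article's types.
#[derive(Debug)]
enum Term {
    Bool(bool),
    Int(i64),
    Not(Box<Term>),
}

#[derive(Debug, PartialEq)]
enum Value {
    Bool(bool),
    Int(i64),
}

/// Hypothetical error type: carries the offending value for a useful message.
#[derive(Debug, PartialEq)]
enum EvalError {
    NotOnNonBool(Value),
}

/// Value -> Result<Value, EvalError>, as suggested in the comment above.
fn not(v: Value) -> Result<Value, EvalError> {
    match v {
        Value::Bool(b) => Ok(Value::Bool(!b)),
        other => Err(EvalError::NotOnNonBool(other)),
    }
}

/// Each arm stays flat; `?` propagates errors from the recursive call.
fn eval(term: &Term) -> Result<Value, EvalError> {
    match term {
        Term::Bool(b) => Ok(Value::Bool(*b)),
        Term::Int(n) => Ok(Value::Int(*n)),
        Term::Not(m) => not(eval(m)?),
    }
}

fn main() {
    // Nested negation goes through the recursive call.
    let ok = Term::Not(Box::new(Term::Not(Box::new(Term::Bool(true)))));
    println!("{:?}", eval(&ok));

    // A type error surfaces as an Err value; the caller decides whether to panic.
    let bad = Term::Not(Box::new(Term::Int(1)));
    println!("{:?}", eval(&bad));
}
```

With this shape the descriptive error lives in the error type, and a top-level caller can still `unwrap()` or `expect()` if panicking is what it wants.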

gottlobflegel · 2y ago

Notice from the definition of `Term`

    enum Term { Bool(bool), Not(Box<Term>), ... }

that your code simply does not typecheck. `Not` expects a `Box<Term>`, not a `Value`.

It's also worth noting that one would probably want to consider something like

    Not(Not(Bool(true)))

a valid term, which your implementation wouldn't.

josephg · 2y ago

Ah yep, my mistake. I’ve missed out on the recursive inner evaluation. Traits might work better here, but it’s not so obvious.

In any case, I stand by all the other points I’ve made in my comment.

ben-schaaf · 2y ago

> Can be flattened, to approximately halve the length of the program like this

No it can't. You're missing the recursive call to `eval`.

nequo · 2y ago

It looks like if let guards could help with flattening it in nightly but not in stable[1]:

https://rust-lang.github.io/rfcs/2294-if-let-guard.html

[1] https://github.com/rust-lang/rust/issues/51114

CollinEMac · 2y ago · 4 in thread

Why does it seem like I'm hearing about OCaml all the time now? It could just be frequency bias but it wasn't that long ago that I'd never heard of it and now it seems to be getting a lot of attention online.

DonaldPShimoda · 2y ago

I think we're cycling back towards a general preference for statically typed languages, for one thing. Additionally, a number of traditionally functional language characteristics have been finding more widespread adoption among popular languages. Putting these together, OCaml is on a short list of languages that are functional and statically typed and, uh, perhaps "intuitive" is the word I want — Haskell is not very intuitive for many people due to its lazy evaluation scheme.

In my opinion, OCaml would see even more widespread use if the documentation were improved. I find it a chore to figure out how to use OCaml well. I also would like to use third-party libraries like Jane Street's Base because they've put a lot of work into providing even more functionality in their standard library, but their documentation is absolutely atrocious (where it exists at all).

OCaml is a mature language but does not have a very supportive ecosystem. I'm hoping the renewed interest will prompt changes there.

UncleOxidant · 2y ago

> Why does it seem like I'm hearing about OCaml all the time now?

I felt that way about a dozen years ago. These things have cycles, apparently. But they also recently released multi-core OCaml in OCaml 5 which opens some doors for OCaml that were previously not open.

zem · 2y ago

the ocaml ecosystem has been going through something of a renaissance over the last few years (i believe because the build and package management tooling hit some sort of inflection point with dune and opam respectively), so there's been a lot of increased interest in it. it was (imo, of course) always a very pleasant language to use, and produced small and fast executables; the tooling was what was really holding it back.

59nadir · 2y ago

To some degree a lot of people are now finding OCaml via dev YouTube influencers who highlight OCaml without actually having used it. There's a lot of enthusiasm, kind of like how lots of people were loving and super excited about Rust without ever using it even for toy programs.

Edit:

I have used OCaml in production and currently I don't see a point to doing it again for the vast majority of problems. From a holistic language + runtime point of view OCaml occupies a space where it's not useful enough from a runtime perspective to replace any of the more convenient languages that exist and not low-level enough to fill the spot of any of the good alternatives in that space. Modularity-wise functors are nice but ultimately plenty of alternatives exist even in the lower-level languages.

With all that said, people should probably use the hell out of it if they're excited. It's a bit tiring seeing the constant stream of misinformation regarding alternatives to OCaml, though. There are good reasons it's losing out in industrial use to even languages like Haskell.

norir · 2y ago · 3 in thread

Here are the features that I think are most important for compiler development:

1) built in eval -- this allows you to transpile to the host language which is invaluable for writing small tests

2) multiline string syntax -- for evaling more than just one liners

3) built in associative and sequential arrays (for the ast)

4) first class closures

5) panic support (for aborting early from unimplemented use cases)

The AST can be represented as an associative array. Each element type can have a 'type' field and rather than pattern matching, you can use if/else. Performance doesn't really matter for the bootstrap compiler because it will only ever be run on relatively small input sets. To get started, you simply walk the ast to transpile to the host language. The snippet is then evaled in the host language to test functionality. Closures allow you to implement the visitor pattern for each ast node, which allows contextual information to be seamlessly interwoven amongst ast nodes during the analysis/transpilation steps.

Keeping all of this in mind, I have identified luajit as my personal favorite language for compiler development. It checks the boxes above, has excellent all around performance for a dynamic language (particularly when startup time is included -- js implementations may beat it on many benchmarks but almost always have slow start up time relative to luajit) and provides a best in class ffi for host system calls. You can run 5000+ line lua scripts faster than most compilers can compile hello, world.

The other reason I like lua(jit) is the minimalism. Once you master lua (which is possible because of its small size) it becomes very obvious that if you can implement something in lua, you can translate the lua implementation to essentially any other language. In this way, there is a sense in which writing a lua implementation becomes almost like a rosetta stone in which a translation can be produced for nearly any other language. With more powerful languages, it is hard to resist the temptation to utilize features that can't always be easily transported to another language. In other words, lua makes it easy to write portable code. This is true both in the sense that lua can be installed on practically any computer in at most a few minutes and in the sense that the underlying structure of a lua program that transcends the syntax of the language can be ported to another computing environment/language.

Another benefit of transpiling to lua is that your new language can easily inherit lua's best properties such as embeddability, fast start up time and cross platform support while removing undesirable features like global variables. Your language can then also be used to replace lua in programs like nginx, redis and neovim that have lua scripting engines. This of course extends to transpiling to any language, which again should be relatively easy if you have already transpiled to lua.

CapsAdmin · 2y ago

I chose luajit for my language and while I agree with many of your points I really miss a typesystem. Somewhat ironically I'm working on a typesystem for luajit..

I also wish it was a bit more performant, but here it's likely my medium to high level code and not luajit's fault. However, running the test suite in plain Lua seems to be some order of magnitude slower than luajit, so it's a lot faster than plain Lua at least.

dunham · 2y ago

I would add pattern matching. I've found this really helpful for manipulating ASTs by matching multiple levels of the tree and pulling out values simultaneously.

CapsAdmin · 2y ago

Is your project public? I'm curious to see how you would go about writing a transpiler in luajit.

telios · 2y ago · 2 in thread

I don't quite follow the algorithm here, but I'm not sure the `gensym` Rust implementation works as expected. `RefCell::clone` does not return a copy of the reference; it returns a new `RefCell` with the current `RefCell`'s value, resulting in duplicate IDs. However, a `RefCell` isn't even necessary here, since a `Cell` would do just fine - and you'd pass around a reference to that `Cell` instead of cloning it.
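To make the `RefCell::clone` point concrete, here is a small sketch. `gensym_cloned` is a hypothetical reconstruction of the failure mode, not the article's actual code; `gensym` shows the `Cell` alternative with the counter passed by reference.

```rust
use std::cell::{Cell, RefCell};

/// Hypothetical buggy shape: the counter arrives by value, so the
/// increment only ever touches the caller's *copy*.
fn gensym_cloned(counter: RefCell<u32>) -> u32 {
    let mut c = counter.borrow_mut();
    *c += 1;
    *c
}

/// Cell-based version: every caller shares the one counter through `&Cell`.
fn gensym(counter: &Cell<u32>) -> u32 {
    let id = counter.get();
    counter.set(id + 1);
    id
}

fn main() {
    // `RefCell::clone` copies the inner value, so both calls start from 0
    // and hand out the same id.
    let cloned = RefCell::new(0);
    let a = gensym_cloned(cloned.clone());
    let b = gensym_cloned(cloned.clone());
    println!("cloned RefCell: {a} {b}");

    // Passing a reference keeps a single counter, so ids are distinct.
    let shared = Cell::new(0);
    let x = gensym(&shared);
    let y = gensym(&shared);
    println!("shared Cell: {x} {y}");
}
```

`Cell` is enough here because a `u32` is `Copy`: there is never a need to borrow the contents, only to read and overwrite them.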

It does feel like the code was ported as-is to Rust, and only adjusted slightly to compile; there are going to be pain points as a result of this process. I suspect this is the source of some of the author's complaints, especially given:

> Although it provides us with a greater sense of how the code is executing, it brings very little value to the algorithm itself.

Rust is, in general, for people who find value in having that information; it is okay to not want to have to worry about ownership, borrowing, safety, etc., but it seems a bit odd to complain about this when that's what Rust is for? If you want to focus on just the algorithm, and not how it's executing, then OCaml is definitely a valid choice.

However, the point about GADTs - can Rust's recently-stabilized GATs not work in the same way? Though I will admit that Rust's GATs don't seem nearly as powerful as OCaml's GADTs in this regard.

zem · 2y ago

> it seems a bit odd to complain about this when that's what Rust is for

that's the point of the article - rust gives you a lot of low-level control, but if you don't actually need that control then you're paying the cost in ergonomics for nothing.

bilboa · 2y ago

Exactly. I did get the impression that the author is more familiar with OCaml than Rust. However I don't think they were claiming Rust's greater low-level control makes it inferior to OCaml in general. They're just saying it makes it less suitable for writing compilers, since (in the author's opinion) this level of low-level control isn't necessary for that task.

zerr · 2y ago · 2 in thread

Real World OCaml book needs a second edition.

hardwaregeek · 2y ago

There is one! https://dev.realworldocaml.org

zerr · 2y ago

That was fast! Now thinking, should I make a wish for a second edition of Real World Haskell? :)

yafbum · 2y ago · 1 in thread

This starts out as a fair comparison but evolves pretty quickly towards a one-sided recommendation for Ocaml. I'm quite sure that there are _some_ advantages of Rust that are not listed here and would be curious to learn more about them too.

eddd-ddde · 2y ago

Even the written tone feels biased:

> Here is CPS conversion written in Rust

vs

> The same algorithm in idiomatic OCaml

Sounds to me like they are comparing bad code and good code, not the languages themselves.

programmer_dude · 2y ago · 1 in thread

Neither, F# for the win!

foderking · 2y ago

based f# enjoyer

convolvatron · 2y ago · 1 in thread

isn't rust kind of a nonstarter for cfg representations which are irreducibly cyclic? yes, one can use the pointers are array-indices thing, but ..

tomjakubowski · 2y ago

No, it's not. For a toy compiler (or for compiling programs with small CFGs), you can use Rc/Weak to represent cycles. For a "real" compiler you'd be using an arena for allocations anyway, which amounts to pointers-are-array-indices.
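A minimal sketch of the pointers-are-array-indices representation, assuming a CFG that only stores successor edges. `BlockId`, `Block`, and `Cfg` are illustrative names; a real compiler would hang instructions and predecessor lists off each block.

```rust
// Illustrative names only: BlockId, Block, and Cfg are hypothetical.

/// A block is identified by its index in `Cfg::blocks`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct BlockId(usize);

#[derive(Debug, Default)]
struct Block {
    succs: Vec<BlockId>,
}

#[derive(Debug, Default)]
struct Cfg {
    blocks: Vec<Block>,
}

impl Cfg {
    fn add_block(&mut self) -> BlockId {
        self.blocks.push(Block::default());
        BlockId(self.blocks.len() - 1)
    }

    /// Edges are plain indices, so cycles need no Rc, Weak, or unsafe.
    fn add_edge(&mut self, from: BlockId, to: BlockId) {
        self.blocks[from.0].succs.push(to);
    }

    /// True if `to` can be reached from `from` (iterative depth-first search).
    fn reaches(&self, from: BlockId, to: BlockId) -> bool {
        let mut seen = vec![false; self.blocks.len()];
        let mut stack = vec![from];
        while let Some(b) = stack.pop() {
            if b == to {
                return true;
            }
            // Skip blocks already visited so cycles terminate.
            if std::mem::replace(&mut seen[b.0], true) {
                continue;
            }
            stack.extend(self.blocks[b.0].succs.iter().copied());
        }
        false
    }
}

fn main() {
    let mut cfg = Cfg::default();
    let entry = cfg.add_block();
    let header = cfg.add_block();
    let body = cfg.add_block();
    let exit = cfg.add_block();
    cfg.add_edge(entry, header);
    cfg.add_edge(header, body);
    cfg.add_edge(body, header); // back edge: this is the cycle
    cfg.add_edge(header, exit);

    println!("body reaches header: {}", cfg.reaches(body, header));
    println!("exit reaches entry: {}", cfg.reaches(exit, entry));
}
```

The borrow checker never sees the cycle, because an edge is a `usize` rather than a reference; the cost is that a stale or out-of-range `BlockId` is caught by a bounds check at run time instead of at compile time.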

yberreby · 2y ago

As someone who wrote a fair amount of Rust and OCaml code, I have to agree with the author.

While working at Routine (YC W21), I was tasked with porting our core library to iOS to minimize duplication of business logic. This was a lucky opportunity to write something resembling a compiler: it took in schemas described with our in-house data exchange library and generated C (for FFI) and Swift code (for the end users, i.e., iOS developers).

Since Routine uses OCaml for everything (which was a big motivator for joining the company—I wanted to see how that would work out), I wrote it in OCaml. The end result is a 3-5k LOC project. It's by no means a full compiler, but it was lots of fun to write. The language got in the way incredibly rarely. On average, it made my life a lot easier. We did encounter our fair share of issues, mostly due to the cross-compilation tooling[1], third-party libraries, and intricacies of FFI. Those do take their toll on sanity.

I tried my hand at writing small compilers / interpreters in Rust, and the experience was nowhere near as smooth. It was fun, and the runtime performance is definitely there, but the ergonomics aren't the same. I especially miss first-class modules whenever I code in something other than OCaml now.

  [1]: we initially used esy [2], flirted with Nix, and eventually switched to opam-cross-ios [3].
  [2]: https://github.com/esy/esy/
  [3]: https://github.com/ocaml-cross/opam-cross-ios

devit · 2y ago

The author doesn't seem to have much experience in writing Rust.

For instance, passing RefCell<u32> by value as their code does makes no sense (just use u32...), and the code seems to have a lot of clones, most of which are probably unnecessary, while not having a single instance of the "mut" keyword.

In fact, I'm pretty sure it's completely broken, since their gensym doesn't do what they want, due to their wrong use of clones and refcells (it should take an &mut u32 and just increment it).
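A sketch of the `&mut u32` signature suggested above. `fresh_pair` is a hypothetical caller added only to show the counter being threaded through by mutable reference; it is not from the article.

```rust
/// Return the current id and advance the counter in place.
fn gensym(counter: &mut u32) -> u32 {
    let id = *counter;
    *counter += 1;
    id
}

/// Hypothetical caller: asks for two fresh ids from the same counter.
fn fresh_pair(counter: &mut u32) -> (u32, u32) {
    let first = gensym(counter);
    let second = gensym(counter);
    (first, second)
}

fn main() {
    let mut counter = 0;
    let (a, b) = fresh_pair(&mut counter);
    let c = gensym(&mut counter);
    println!("{a} {b} {c}");
}
```

No `RefCell`, `Rc`, or `clone` is involved: the exclusive borrow is what guarantees each id is handed out once.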

And definitely not idiomatic at all.

pjmlp · 2y ago

If one cares about productivity in typical compiler data structures, naturally OCaml.

medo-bear · 2y ago

> Lisps can be very flexible, but they usually lack static type safety, opening a wide and horrible door to run-time errors.

People should do basic research before writing something silly like this. Qualifying your statement with 'usually' is just a chicken sh*t approach. Common Lisp and Racket have optional strong typing, leaving the responsibility and choice to the developer. Common Lisp is great for implementing compilers. You also have things like Typed Racket and Coalton. The latter is completely statically typed ala MLTON

https://github.com/coalton-lang/coalton

nine_k · 2y ago

A lot of people here say that the Rust version is non-idiomatic and likely not even working.

So the answer apparently is simple: between two languages, for an important project use the one which you wield best.

smitty1e · 2y ago

I was going to ask, but the author answers this in the end:

> Other alternatives to consider is Haskell and various Lisp dialects. If you have already “tamed” Haskell (my congratulations and condolences), probably learning OCaml just for writing a compiler is not going to be worth it; if you have not, OCaml is a much more approachable language.

This is an interesting claim, as I thought Haskell and OCaml were more or less equivalently inscrutable.

auggierose · 2y ago

I suggest TypeScript.

j / k navigate · click thread line to collapse

101 comments

62 comments · 18 top-level

nu11ptr2y ago· 8 in thread

ecshafer2y ago

nu11ptr2y ago

1 more reply

wredue2y ago

I don’t even like rust.

2 more replies

hurril2y ago

I've been seriously at it with Rust for about six months now and really loving it. What do you mean by "use an arena for the AST?" What is an arena in this context?

fuklief2y ago

I believe it means flattening the AST, here is a nice blog post about this technique https://www.cs.cornell.edu/~asampson/blog/flattening.html

1 more reply

remexre2y ago

https://docs.rs/bumpalo/latest/bumpalo/

bobbylarrybobby2y ago

If you're curious about arena allocators, look at the rust compiler itself. It uses one for all of its allocations as regular heap allocation was not performant enough.

mughinn2y ago

i assume they're talking about an Arena Allocator

https://blog.logrocket.com/guide-using-arenas-rust/

1 more reply

hardwaregeek2y ago· 8 in thread

The Rust one reads like...well a program. A program that's not as beautiful, but is very much designed to be taken apart, debugged, improved, etc.

grumpyprole2y ago

> There's no type annotations for me to figure out what the heck each term is

You can add additional annotations in OCaml if you want, or just query the type of a term in Merlin.

> a compiler with thousands of lines that I navigate with an IDE, with logging, with annoying little edge cases, with dozens of collaborators, I'd choose Rust.

jasinjames2y ago

I like that term "Complexity Budget" a lot. I'm going to steal that one!

brmgb2y ago

hardwaregeek2y ago

1 more reply

liampulles2y ago

Conversely, I'm not familiar with Rust, and Rust looks pretty illegible to me. Probably we need more experience with these languages before we judge.

whimsicalism2y ago

> There's no type annotations for me to figure out what the heck each term is.

If types can be perfectly inferred, I don't see why I wouldn't take advantage of it. Whatever Ocaml IDE you use will easily be able to tell you the types anyways

Karliss2y ago

nine_k2y ago

The problem is that the article is not an IDE. It would be nice to add some explicit annotations.

beaub2y ago· 8 in thread

Brushing off the actual efficiency of the produced binary seems like a huge oversight when dealing with a compiler.

memefrog2y ago

nine_k2y ago

Almost every compiler is fast on toy-sized programs. E.g. the standard Java compiler is pretty fast, and uses little resources.

1 more reply

fauigerzigerk2y ago

>People like fast compile times, but that is more to do with language design than the choice of language for the compiler.

People like fast compile times and people either like to use (or are forced to use) languages that are inherently slow to compile. That's exactly why compiler performance is absolutely critical.

curling_grad2y ago

> I am not sure that the runtime efficiency of the compiler binary is that important.

If the compiler is for JIT, then efficiency will be important.

bjourne2y ago

tester7562y ago

>while also having high runtime efficiency as a requirement (the reason why C/C++ are such prominent languages for compilers

I'd want to believe that compiler engineers really put effort into compilers performance, but I just don't buy it.

LLVM, GCC, MSVC, etc, etc all of them touch C/C++ and are slow as hell

For compilers written in other languages I'd say that still LLVM is the bottleneck

>It avoids the cost of automatic memory management and provides low-level control.

What "low-level control" do you need? It is not firmware development.

Btw: Microsoft rewrote their C# compiler from C++ to C#.

JonChesterfield2y ago

It's really easy to accidentally write code that walks off a performance cliff on unexpected input, but that's likely to get hacked around if someone reports it, as slow compilers do annoy people.

1 more reply

freeopinion2y ago

Did Microsoft rewrite their C++ compiler in C#?

1 more reply

josephg2y ago· 6 in thread

But this is silly.

Some obvious examples of ocaml ideas being overapplied to rust in this article:

1. The types don't really need to be wrapped in Rc here.

2. Rust generally prefers mutable imperative code over applicative code. And if you insist on applicative patterns, functions should take a &Foo.

3. Rust code usually doesn't rely on recursion that much, so the lack of guaranteed TCO isn't something people in the community care about.

The match example is just straight out bad rust code. This code:

    fn eval(term: &Term) -> Value {
        match term {
            Bool(b) => Value::Bool(*b),
            Not(m) => match eval(m) {
                Value::Bool(b) => Value::Bool(!b),
                _ => panic!("`Not` on a non-boolean value"),
            },
            // ... lots more nested matches & panics
        }
    }

Can be flattened, to approximately halve the length of the program like this:

    fn eval(term: &Term) -> Value {
        match term {
            Bool(b) => Value::Bool(*b),
            Not(Value::Bool(b)) => Value::Bool(!b),
            // ... (all other valid patterns)
            _ => panic!("{term} invalid"),
        }
    }

nequo2y ago

Your example is more concise but the error message "{term} invalid" is less descriptive than the author's.

What would be the idiomatic way for the function `eval` to provide good error messages in Rust?
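One possible answer, sketched under assumptions: return a `Result` with a dedicated error type instead of panicking, so each failure carries the operator name and the offending value. The `Term` and `Value` definitions, the `Int` variants, and the `EvalError` type are hypothetical stand-ins for illustration, not the article's actual code.

```rust
use std::fmt;

// Hypothetical types modelled on the thread's `Term`/`Value`.
#[derive(Debug, Clone, PartialEq)]
enum Term {
    Bool(bool),
    Int(i64),
    Not(Box<Term>),
}

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Bool(bool),
    Int(i64),
}

// Hypothetical error type: records which operator failed and on what value.
#[derive(Debug, PartialEq)]
enum EvalError {
    TypeMismatch {
        op: &'static str,
        expected: &'static str,
        found: Value,
    },
}

impl fmt::Display for EvalError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            EvalError::TypeMismatch { op, expected, found } => {
                write!(f, "`{}` expected {}, found {:?}", op, expected, found)
            }
        }
    }
}

impl std::error::Error for EvalError {}

fn eval(term: &Term) -> Result<Value, EvalError> {
    match term {
        Term::Bool(b) => Ok(Value::Bool(*b)),
        Term::Int(n) => Ok(Value::Int(*n)),
        // `?` propagates errors from the recursive call unchanged.
        Term::Not(inner) => match eval(inner)? {
            Value::Bool(b) => Ok(Value::Bool(!b)),
            found => Err(EvalError::TypeMismatch {
                op: "Not",
                expected: "a boolean",
                found,
            }),
        },
    }
}
```

The nested `match` stays, but the caller now decides whether to print the message, attach source locations, or abort.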

rtpg2y ago

I would probably put a trait implementation on Value that does `.invert_bool`, and have that panic. That way eval is just `eval(m).invert_bool()` and if it panics it panics.
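A rough sketch of that idea, written as an inherent method on `Value` and not as a separate trait. The type definitions and the `Int` variants are assumptions made up for this example; only `invert_bool` and the `eval(m).invert_bool()` call come from the comment above.

```rust
// Hypothetical minimal types; the real article's definitions may differ.
#[derive(Debug, PartialEq)]
enum Term {
    Bool(bool),
    Int(i64),
    Not(Box<Term>),
}

#[derive(Debug, PartialEq)]
enum Value {
    Bool(bool),
    Int(i64),
}

impl Value {
    /// Negates a boolean value and panics with a specific message otherwise.
    fn invert_bool(self) -> Value {
        match self {
            Value::Bool(b) => Value::Bool(!b),
            other => panic!("`Not` on a non-boolean value: {:?}", other),
        }
    }
}

fn eval(term: &Term) -> Value {
    match term {
        Term::Bool(b) => Value::Bool(*b),
        Term::Int(n) => Value::Int(*n),
        // The recursive call is kept, so nested `Not` terms still evaluate.
        Term::Not(m) => eval(m).invert_bool(),
    }
}
```

This keeps `eval` flat while the type check and its error message live in one place.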

gottlobflegel2y ago

Notice from the definition of `Term`

enum Term { Bool(bool), Not(Box<Term>), ... }

that your code simply does not typecheck. `Not` expects a `Box<Term>`, not a `Value`.

It's also worth noting that one would probably want to consider something like

Not(Not(Bool(true)))

a valid term, which your implementation wouldn't.

josephg2y ago

Ah yep, my mistake. I’ve missed out on the recursive inner evaluation. Traits might work better here, but it’s not so obvious.

In any case, I stand by all the other points I’ve made in my comment.

1 more reply

ben-schaaf2y ago

> Can be flattened, to approximately halve the length of the program like this

No it can't. You're missing the recursive call to `eval`.

nequo2y ago

It looks like if let guards could help with flattening it in nightly but not in stable[1]:

https://rust-lang.github.io/rfcs/2294-if-let-guard.html

[1] https://github.com/rust-lang/rust/issues/51114

CollinEMac2y ago· 4 in thread

DonaldPShimoda2y ago

OCaml is a mature language but does not have a very supportive ecosystem. I'm hoping the renewed interest will prompt changes there.

UncleOxidant2y ago

> Why does it seem like I'm hearing about OCaml all the time now?

zem2y ago

59nadir2y ago

Edit:

norir2y ago· 3 in thread

Here are the features that I think are most important for compiler development:

1) built in eval -- this allows you to transpile to the host language which is invaluable for writing small tests

2) multiline string syntax -- for evaling more than just one liners

3) built in associative and sequential arrays (for the ast)

4) first class closures

5) panic support (for aborting early from unimplemented use cases)

CapsAdmin2y ago

I chose luajit for my language and while I agree with many of your points I really miss a typesystem. Somewhat ironically I'm working on a typesystem for luajit..

dunham2y ago

I would add pattern matching. I've found this really helpful for manipulating ASTs by matching multiple levels of the tree and pulling out values simultaneously.
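For illustration, a small Rust sketch of matching two levels of a tree in one pattern and pulling out three values at once. It assumes a hypothetical `Expr` type whose children are plain references (for example, handed out by an arena), because stable Rust patterns cannot look through a `Box`.

```rust
// Hypothetical AST; children are borrowed references, not `Box`es.
#[derive(Debug, PartialEq)]
enum Expr<'a> {
    Num(i64),
    Add(&'a Expr<'a>, &'a Expr<'a>),
    Mul(&'a Expr<'a>, &'a Expr<'a>),
}

/// Recognises the shape `a * b + c` over literal numbers and returns
/// `(a, b, c)`; any other shape yields `None`.
fn as_multiply_add(e: &Expr<'_>) -> Option<(i64, i64, i64)> {
    match e {
        // One pattern spans the `Add` node, its `Mul` child, and three leaves.
        Expr::Add(Expr::Mul(Expr::Num(a), Expr::Num(b)), Expr::Num(c)) => Some((*a, *b, *c)),
        _ => None,
    }
}
```

With `Box`ed children the same match would need one nested `match` per level, which is roughly the ergonomic gap the article complains about.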

CapsAdmin2y ago

Is your project public? I'm curious to see how you would go about writing a transpiler in luajit.

telios2y ago· 2 in thread

> Although it provides us with a greater sense of how the code is executing, it brings very little value to the algorithm itself.

However, the point about GADTs - can Rust's recently-stabilized GATs not work in the same way? Though I will admit that Rust's GATs don't seem nearly as powerful as OCaml's GADTs in this regard.

zem2y ago

> it seems a bit odd to complain about this when that's what Rust is for

that's the point of the article - rust gives you a lot of low-level control, but if you don't actually need that control then you're paying the cost in ergonomics for nothing.

bilboa2y ago

zerr2y ago· 2 in thread

Real World OCaml book needs a second edition.

hardwaregeek2y ago

There is one! https://dev.realworldocaml.org

zerr2y ago

That was fast! Now thinking, should I make a wish for a second edition of Real World Haskell? :)

yafbum2y ago· 1 in thread

eddd-ddde2y ago

Even the written tone feels biased:

> Here is CPS conversion written in Rust

vs

> The same algorithm in idiomatic OCaml

Sounds to me like they are comparing bad code and good code, not the languages themselves.

programmer_dude2y ago· 1 in thread

Neither, F# for the win!

foderking2y ago

based f# enjoyer

convolvatron2y ago· 1 in thread

isn't rust kind of a nonstarter for cfg representations which are irreducibly cyclic? yes, one can use the pointers are array-indices thing, but ..
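For reference, the "pointers are array indices" approach might look roughly like the sketch below. All names here (`BlockId`, `BasicBlock`, `Cfg`) are hypothetical; the point is only that a back edge is just another index, so cycles need no `Rc` or `unsafe`.

```rust
/// Index of a basic block inside `Cfg::blocks`; stands in for a pointer.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct BlockId(usize);

#[derive(Debug, Default)]
struct BasicBlock {
    successors: Vec<BlockId>,
}

#[derive(Debug, Default)]
struct Cfg {
    blocks: Vec<BasicBlock>,
}

impl Cfg {
    fn add_block(&mut self) -> BlockId {
        self.blocks.push(BasicBlock::default());
        BlockId(self.blocks.len() - 1)
    }

    fn add_edge(&mut self, from: BlockId, to: BlockId) {
        self.blocks[from.0].successors.push(to);
    }

    /// Blocks reachable from `entry` in depth-first order.
    /// The `seen` vector is what makes cycles harmless.
    fn reachable(&self, entry: BlockId) -> Vec<BlockId> {
        let mut seen = vec![false; self.blocks.len()];
        let mut order = Vec::new();
        let mut stack = vec![entry];
        while let Some(id) = stack.pop() {
            if seen[id.0] {
                continue;
            }
            seen[id.0] = true;
            order.push(id);
            // Push in reverse so the first successor is visited first.
            for &succ in self.blocks[id.0].successors.iter().rev() {
                stack.push(succ);
            }
        }
        order
    }
}
```

The trade-off is that an index carries no lifetime or ownership information, so a stale `BlockId` is caught by a bounds check at run time, not by the borrow checker.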

tomjakubowski2y ago

yberreby2y ago

As someone who wrote a fair amount of Rust and OCaml code, I have to agree with the author.

  [1]: we initially used esy [2], flirted with Nix, and eventually switched to opam-cross-ios [3].
  [2]: https://github.com/esy/esy/
  [3]: https://github.com/ocaml-cross/opam-cross-ios

devit2y ago

The author doesn't seem to have much experience in writing Rust.

In fact, I'm pretty sure it's completely broken, since their gensym doesn't do what they want, due to their wrong use of clones and refcells (it should take an &mut u32 and just increment it).

And definitely not idiomatic at all.
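A minimal sketch of what such a `gensym` could look like. The `_v` prefix and the `String` return type are arbitrary choices for this example and are not taken from the article.

```rust
/// Returns a fresh name (`_v0`, `_v1`, ...) and bumps the shared counter.
/// The caller owns the counter and lends it out mutably, so no `Rc`,
/// `RefCell`, or cloning is involved.
fn gensym(counter: &mut u32) -> String {
    let name = format!("_v{}", *counter);
    *counter += 1;
    name
}
```

Threading `&mut u32` through the conversion functions makes it a compile-time error to accidentally work on a copy of the counter.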

pjmlp2y ago

If one cares about productivity in typical compiler data structures, naturally OCaml.

medo-bear2y ago

> Lisps can be very flexible, but they usually lack static type safety, opening a wide and horrible door to run-time errors.

https://github.com/coalton-lang/coalton

nine_k2y ago

A lot of people here say that the Rust version is non-idiomatic and likely not even working.

So the answer apparently is simple: between two languages, for an important project use the one which you wield best.

smitty1e2y ago

I was going to ask, but the author answers this in the end:

This is an interesting claim, as I thought Haskell and OCaml were more or less equivalently inscrutable.

auggierose2y ago

I suggest TypeScript.
