After using it for a couple of weeks I am confused why ML/OCaml aren't more popular. They are safe, functional, stable, fast, and have great tooling. They seem poised to take over the functional domain.
The syntax took a little getting used to (emphasis on little), but once you are used to it, it's very natural. Union types are wonderful, and the implicit type safety within programs is nice.
The beginner material is kind of true as well, although I do believe that the little beginner material that exists for OCaml (pretty much just the official documentation/Learn OCaml and Real World OCaml) is much easier to get started with than the tutorials of many other languages. OCaml's learning material is pretty short and to the point, and I found it to be a good set of bootstrap knowledge: I pretty much learned the core language in 3-4 days or so and then went on to explore the various libraries, tools, and language features not covered by the basic tutorials at my own pace.
Mutability, OO and various other features are all there just when you need them. You don't need, like in Haskell, to do incredible contortions to be able to express things naturally.
Regardless which algorithm and API you want, there is a pretty good chance you can express it in OCaml naturally, and it'll almost always be reasonably efficient by default.
Also, everyone underestimates modules a lot. They're the best software development tool in any language by a long shot.
I'd say the MLs are functional-first (with imperative/OO on top), just as Ruby is OO-first (with some functional on top).
For me this type of multi-paradigm is ok. It starts to hurt when all the paradigms are "first", which I see in Scala.
For me, the issue is the GIL, although that is being worked on as we speak.
Now, the ML family of languages (SML, OCaml, F#) is technically a family of multi-paradigm, functional-first languages, but that doesn't help with clearing the "functional" hurdle in popular perception.
That's my experience with it anyway.
I've written plenty of OCaml, and I can't see how this would ever be a problem. Are you writing 500-line functions in OCaml? It seems like it would be difficult to write such a long function in OCaml. And why would pattern matching cause it?
Maybe I'm misunderstanding something.
Using functional languages without Lisp-like macros will always be sort of weird to me, like I'm missing out on something.
[1] https://github.com/dannywillems/ocaml-for-web-programming
Actually it has two OCaml->JS compilers of very high quality. The first one, js_of_ocaml, could bootstrap the whole compiler several years ago (probably the first one there).
The more recent one, https://github.com/bloomberg/bucklescript, pushes JS compilation to the next level: it generates fairly readable code, has a good FFI story, and its compilation is extremely fast. Check out the JS version of the compiler (http://bloomberg.github.io/bucklescript/js-demo/), and imagine how fast the native version would be. BuckleScript has a good story for Windows and generates fairly efficient code; see the benchmark here: https://github.com/neonsquare/bucklescript-benchmark BuckleScript is already used in production by big companies: for example, 25% of Facebook's messenger.com is powered by BuckleScript, and the new WebAssembly spec interpreter by Google is also partly cross-compiled into JS by BuckleScript.
Disclaimer: I am one of the authors of BuckleScript
- My coeffects page (http://tomasp.net/coeffects) is an implementation of a simple ML-like language with coeffect type system. It was written using FunScript, which is a precursor of Fable - Fable improved many things, but this was over a year ago when it was not around yet.
- The Gamma (https://thegamma.net) is a web-based language for doing simple data science work and the compiler for that is written all in Fable. It works perfectly and integrates neatly with things like virtual-dom (source code is on GitHub https://github.com/the-gamma/thegamma-script)
It's able to self-host as well. Check it out at http://fable.io/repl
F# also runs on .NET Core, which is cross-platform and comes with a good CLI. Documented, too: https://docs.microsoft.com/en-us/dotnet/articles/fsharp/tuto...
What's wrong with the OCaml syntax? It's much cleaner than, say, Scala's, it's indentation-insensitive, and 'a list feels more relevant than list<'T>.
Jump in fellas, the language is powerful, and the community is nice!
(I'm not interested in a Scala vs OCaml language comparison. I know both languages very well. I'd be interested in the quality of JS support.)
What do you mean? Scala.js allows you to use things from the Java ecosystem in JavaScript?
First of all, OCaml/SML are the best choice in terms of example code for compilers. They're historically the choice of many compiler/interpreter/type theory texts (Types and Programming Languages, Modern Compiler Implementation in ML, and an ML is even used as a language to interpret in Essentials of Programming Languages). Andrej Bauer's PLZOO is also written in OCaml. Equally important is the fact that there are a variety of ML implementations, all of which are much more approachable than GHC. The OCaml compiler's codebase is of a reasonable size, such that an individual could get a good idea of how it works in a few weeks or so. SMLNJ, MLKit, MLton, and CakeML are all open source and on GitHub, and all seem to be fairly approachable in comparison to the monolith that is GHC. And that's not even mentioning other compilers written in ML (Haxe, Rust's bootstrap compiler, Facebook's Hack compiler, etc.). The fact that there are real-world, open-source compilers with perfectly approachable code bases (even without great familiarity with the language; compilers in Haskell might require an in-depth understanding of many of the core type classes and language extensions available) is highly attractive to novice compiler writers.
Additionally, the feature set in MLs is a good choice for compilers. While they lack some of the cooler features of Haskell, MLs make up for it in simplicity; lots of the features in GHC's type system (especially with language extensions) mean very little for 90% of compiler writers, and getting rid of them from the get-go helps keep the code small and easy to reason about (even if you won't have as much type safety in the compiler itself). This also means that there are far fewer ways to do a single thing, which can be nice when you're not sure exactly how you're going to implement a certain feature. However, one thing I find incredibly useful is OCaml's polymorphic variants. These are pretty much perfect for statically enforcing a nanopass-like system in your compiler and are a great way of designing your AST/data types in your compiler. I feel like this gets passed up a ton (as far as I know I'm the first person who's used them to create nanopasses), but it's quite convenient and makes OCaml a good competitor for Scheme in this regard.
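To make the polymorphic-variant idea above a bit more concrete, here is a minimal sketch of what one nanopass-style step could look like. It is not the commenter's code; the type names surface and core, the tags, and the functions desugar and size are all made up for illustration. Each pass gets its own variant type, so a later pass cannot be handed a term that still contains a construct an earlier pass was supposed to remove.

```ocaml
(* Surface language: includes `Let as syntactic sugar. *)
type surface =
  [ `Int of int
  | `Var of string
  | `Lam of string * surface
  | `App of surface * surface
  | `Let of string * surface * surface ]

(* Core language: the same tags minus `Let. *)
type core =
  [ `Int of int
  | `Var of string
  | `Lam of string * core
  | `App of core * core ]

(* One small pass: rewrite `Let (x, rhs, body) to `App (`Lam (x, body), rhs). *)
let rec desugar (e : surface) : core =
  match e with
  | `Int n -> `Int n
  | `Var x -> `Var x
  | `Lam (x, body) -> `Lam (x, desugar body)
  | `App (f, a) -> `App (desugar f, desugar a)
  | `Let (x, rhs, body) -> `App (`Lam (x, desugar body), desugar rhs)

(* A later pass over core has no `Let case to write; handing it a value of
   type surface is rejected by the type checker. *)
let rec size (e : core) : int =
  match e with
  | `Int _ | `Var _ -> 1
  | `Lam (_, body) -> 1 + size body
  | `App (f, a) -> 1 + size f + size a
```

The tags are shared between the two types, so only the cases that actually change need new constructors, which is where this differs from declaring two unrelated ordinary variants.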
> it suffers from the Perl-ish woe of write-once and read-never.
My experience with Haskell is the opposite: I think Haskell yields very maintainable code that is largely self-documenting and allows me to confidently hack around old code bases.
My experience with Perl is the same. Very hard to read back, maintain or get productive on old code bases.
Haskell is a language like any other. Many people would like to complicate it, but if you spend the time learning its syntax and semantics, there is very little need to learn the theory.
My "problems" with OCaml started, when I wanted to "map" over a data structure I defined. I ended up having to define custom mapping functions for all container-like data structures I wrote and call them in a non-polymorphic fashion (where I would have just used fmap in Haskell).
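For readers who haven't hit this, a small sketch of the situation described: two container types, each with its own hand-written map, and call sites that have to name the right one. The types bintree and rose and the function names are hypothetical, chosen only for illustration.

```ocaml
(* Two unrelated container types. *)
type 'a bintree = Empty | Node of 'a bintree * 'a * 'a bintree
type 'a rose = Rose of 'a * 'a rose list

(* Each one needs its own mapping function... *)
let rec map_bintree f = function
  | Empty -> Empty
  | Node (l, x, r) -> Node (map_bintree f l, f x, map_bintree f r)

let rec map_rose f (Rose (x, children)) =
  Rose (f x, List.map (map_rose f) children)

(* ...and every call site picks the function that matches the container,
   where Haskell code would call fmap in both places. *)
let doubled_tree =
  map_bintree (fun n -> n * 2) (Node (Empty, 1, Node (Empty, 2, Empty)))

let doubled_rose =
  map_rose (fun n -> n * 2) (Rose (1, [Rose (2, []); Rose (3, [])]))
```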
Sure, in OCaml I needed to use a parser generator where I would have used megaparsec in Haskell, but it was also a tolerable inconvenience.
Trouble started when I needed to track state in the compilation process. I.e. I was generating variable names for temporary results and values, and I needed to track a number that increased. In the end I used a mutable state for it, and it turned out nightmarish in my unit tests.
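The comment doesn't show its code, so this is only a guess at the shape of the problem: a single global counter makes generated names depend on which tests ran earlier. One common way around that, sketched here with a hypothetical make_gensym helper, is to hide the counter in a closure so each compilation (or each test) gets a fresh generator.

```ocaml
(* Hypothetical helper: every call to make_gensym returns an independent
   name generator with its own private counter. *)
let make_gensym prefix =
  let counter = ref 0 in
  fun () ->
    incr counter;
    Printf.sprintf "%s%d" prefix !counter

(* Each test can create its own generator, so names are predictable
   regardless of what ran before. *)
let () =
  let fresh = make_gensym "tmp" in
  let a = fresh () in
  let b = fresh () in
  Printf.printf "%s %s\n" a b
```

The state is still mutable; it is just scoped to the generator instead of to the whole program.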
After a while, I just ported the code base to Haskell and never looked back. The State monad was an easy fix for my mutable state issues. Parser combinators made the parser much more elegant. And many code paths improved, became much more concise. It is hard to describe, but in direct comparison, OCaml felt much more procedural and Haskell much more declarative (and actually easier to read).
The only advantage of OCaml to me is the strict evaluation. I don't think lazy evaluation by default in Haskell is a great idea.
I believe the answer to both questions is zero. Unfortunately there is pedagogical cruft in the community that makes it appear this way. main :: IO () is comparable to public static void main(args[]) or whatever nonsense in Java.
Nice, I was just asking about this on the nanopass list the other day, do you happen to have a publicly available example of this anywhere?
1. The GC part is true, but one has to remember that this was written at a time when GC was still a bit of an unusual feature in mainstream languages.
2. Tail recursion doesn't really make much of a difference for walking trees, which is recursive, but (mostly) not tail recursive.
3. OCaml in particular uses 63/31-bit ints due to implementation details, which isn't a good fit for 64/32-bit integers. The strings and bignum part is mostly right, though.
4. ADTs can be good or bad for describing ASTs. Once you enrich ASTs with semantics shared by all variants (such as source coordinates), inheritance can become a better fit than ADTs.
8. Type inference doesn't really extend to module signatures, which you have to write out explicitly (though tooling such as `ocamlc -i` allows you to let the compiler help you write them). I also generally find it better to explicitly annotate functions with types. Not only does it make the code more readable later in its life, but you get fewer truly impenetrable type error messages because you forgot parentheses or a semicolon somewhere.
That said, there are several good points still.
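As an illustration of point 8 above, a minimal sketch: the signature has to be written out by hand (or generated with `ocamlc -i` and then edited), while the annotation on the function is optional but documents intent. The names COUNTER, Counter, and bump_twice are invented for this example.

```ocaml
(* The signature is written explicitly; inference does not produce it
   for you in the source file. *)
module type COUNTER = sig
  type t
  val zero : t
  val succ : t -> t
  val to_int : t -> int
end

(* Sealing with the signature hides the fact that t is an int. *)
module Counter : COUNTER = struct
  type t = int
  let zero = 0
  let succ n = n + 1
  let to_int n = n
end

(* The annotation is not required, but it pins down the intended type,
   so a mistake in the body is reported here rather than at a call site. *)
let bump_twice (c : Counter.t) : Counter.t =
  Counter.succ (Counter.succ c)
```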
Unless you, as the article notes, "know how to take advantage of it". Here's a fully tail-recursive binary tree traversal in OCaml:
    type 'a tree = Leaf of 'a | Branch of 'a tree * 'a tree

    let iter f tree =
      let rec iter_rec f worklist tree =
        match tree with
        | Leaf a ->
          (* Perform the action on this element. *)
          f a;
          (* Consult the worklist for more things to do. *)
          begin match worklist with
          | [] -> ()
          | next_tree::worklist' -> iter_rec f worklist' next_tree
          end
        | Branch (left, right) ->
          (* Visit the left subtree, save the right for visiting later. *)
          iter_rec f (right::worklist) left
      in
      iter_rec f [] tree

Usage example:

    let mytree =
      Branch (Branch (Leaf 1, Leaf 2),
              Branch (Leaf 3, Branch (Leaf 4, Leaf 5)))

    let () = iter (Printf.printf "%d\n") mytree

Yes, people do write traversals like this in OCaml, though with less verbosity than this example I whipped up.

> 3. OCaml in particular uses 63/31-bit ints due to implementation details, which isn't a good fit for 64/32-bit integers.
I think the article means here that you just use int for all the kinds of numerical identifiers that compilers give to things like instructions, basic blocks, pseudo-registers, etc., without doing the kind of micro-optimization that C++ programmers would do, guessing whether the number of blocks is safe to store in an unsigned short etc.
For representing constants from the program, which is what you seem to be referring to, the article does suggest using bignums, not OCaml's native ints.
This is a depth-first search with an explicit stack (the stack is tree :: worklist). You can do the same in an imperative language. Tail recursion here is only an extra-complicated way of writing a simple loop, and you're adding extra complexity by having two variables to represent the stack. The same code can be written just as (if not more) compactly in an imperative language.
Non-strictness helps here more than TCO in a strict language.
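For comparison, here is roughly what the "simple loop" version of the traversal discussed above could look like, written in OCaml's imperative subset with the standard library's Stack module. The tree type is repeated from the earlier comment; iter_loop is a name made up here.

```ocaml
type 'a tree = Leaf of 'a | Branch of 'a tree * 'a tree

(* Depth-first, left-to-right traversal with an explicit mutable stack. *)
let iter_loop f tree =
  let stack = Stack.create () in
  Stack.push tree stack;
  while not (Stack.is_empty stack) do
    match Stack.pop stack with
    | Leaf a -> f a
    | Branch (left, right) ->
      (* Push right first so that left is popped, and visited, first. *)
      Stack.push right stack;
      Stack.push left stack
  done

let () =
  let mytree =
    Branch (Branch (Leaf 1, Leaf 2),
            Branch (Leaf 3, Branch (Leaf 4, Leaf 5)))
  in
  iter_loop (Printf.printf "%d\n") mytree
```

Both versions keep the pending subtrees on the heap instead of the call stack; the difference is mostly one of style.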
>4. ADTs can be good or bad for describing ASTs. Once you enrich ASTs with semantics shared by all variants (such as source coordinates), inheritance can become a better fit than ADTs.
Since this article was written we have better ways of augmenting/annotating ASTs. There's a lot of this out there, but here's one example: https://brianmckenna.org/blog/type_annotation_cofree
There are other alternatives that are like inheritance but with better reasoning properties as well. Finally tagless comes to mind.
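A much plainer option than cofree or finally tagless, sketched here in OCaml under made-up names (loc, expr, expr_desc, map_ann): wrap every node in a record that carries the shared annotation, and make the annotation a type parameter. OCaml's own Parsetree uses a similar node-plus-location record layout (pexp_desc and pexp_loc), though with a fixed location type instead of a parameter.

```ocaml
(* Source coordinates shared by every node. *)
type loc = { line : int; col : int }

(* Every node is a record: the variant-specific part plus an annotation. *)
type 'a expr = { desc : 'a expr_desc; ann : 'a }
and 'a expr_desc =
  | Int of int
  | Add of 'a expr * 'a expr

(* Change the annotation on every node, e.g. from locations to types. *)
let rec map_ann f e =
  let desc =
    match e.desc with
    | Int n -> Int n
    | Add (a, b) -> Add (map_ann f a, map_ann f b)
  in
  { desc; ann = f e.ann }

(* 1 + 2, with each node tagged with where it started in the source. *)
let example : loc expr =
  let at line col desc = { desc; ann = { line; col } } in
  at 1 0 (Add (at 1 0 (Int 1), at 1 4 (Int 2)))
```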
>I also generally find it better to explicitly annotate functions with types.
This Haskeller whole-heartedly agrees for all the reasons stated.
Can you explain? Assume I want to fold a function over a large tree and fully inspect the final result. (For example, to compile a large expression to a piece of code.) If I use non-tail recursion, my stack will be exhausted. How does non-strictness help with stack usage?
In Haskell, my current FP language of choice, I can implement a complex transform such as lambda lifting in a few tens of lines of readable, idiomatic code.
Second, you're confusing inheritance with the ability to map subtypes to operations (and in statically typed languages, in a type-safe fashion). This is a function of OCaml's (or SML's, or Haskell's, or F#'s) pattern matching facilities, not of inheritance vs. ADTs. It can also be done with typecase statements, multi-methods (or actually, just external methods), or tree parsers. The tree parser approach in particular is more general and powerful than the typical pattern matchers in functional languages.
Third, if you look at actual compilers, such traversal will commonly be done in an ad-hoc fashion and can be done equally well with bog-standard methods. Where you have generalized traversal mechanisms, the visitor pattern will crop up in OCaml, too (in some guise or another). Examples are the Ast_mapper module for PPX in OCaml itself [1] and the visitor interface in CIL [2]. The reason is that if you want to perform a generalized fold, map, etc. operation over a heterogeneous data structure such as an AST (visitor is usually fold + map due to destructive updates), you need to also provide a set of operations for the various types that you can encounter during traversal.
[1] https://caml.inria.fr/pub/docs/manual-ocaml/libref/Ast_mappe...
[2] https://people.eecs.berkeley.edu/~necula/cil/api/Cil.cilVisi...
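To show what "visitor in some guise" can look like in OCaml, here is a toy mapper in the open-recursion record style. It only imitates the general shape of such interfaces on a three-constructor AST; the types expr and mapper and the values default_mapper and fold_neg are defined here for illustration and are not the real Ast_mapper or CIL API.

```ocaml
type expr =
  | Int of int
  | Add of expr * expr
  | Neg of expr

(* One field per node type. Each function receives the mapper itself, so an
   override still recurses through the overridden version (open recursion). *)
type mapper = { expr : mapper -> expr -> expr }

(* Default behaviour: rebuild the node, mapping the children. *)
let default_mapper = {
  expr = (fun self e ->
    match e with
    | Int _ -> e
    | Add (a, b) -> Add (self.expr self a, self.expr self b)
    | Neg a -> Neg (self.expr self a));
}

(* A specific pass: rewrite Neg (Int n) to Int (-n) and defer every other
   case to the default traversal. *)
let fold_neg = {
  expr = (fun self e ->
    match e with
    | Neg (Int n) -> Int (-n)
    | other -> default_mapper.expr self other);
}

let simplified = fold_neg.expr fold_neg (Add (Neg (Int 1), Int 2))
```

With more node types the record simply grows more fields, which is the "set of operations for the various types" mentioned above.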
I'm not sure what "isn't a good fit" is supposed to mean.
It's not going to make it impossible (you may just need something like a ShortIntLiteral and a LongIntLiteral variant), but it's going to require additional effort.
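A sketch of the two-variant idea mentioned above, assuming literals arrive as int64 values from the lexer. The constructor names come from the comment; the type literal and the function literal_of_int64 are hypothetical.

```ocaml
type literal =
  | ShortIntLiteral of int    (* fits in OCaml's tagged native int *)
  | LongIntLiteral of int64   (* needs the full 64 bits *)

(* Pick the cheap representation when the value fits in a native int
   (63 bits on 64-bit platforms), and fall back to int64 otherwise. *)
let literal_of_int64 (n : int64) : literal =
  if Int64.compare n (Int64.of_int min_int) >= 0
     && Int64.compare n (Int64.of_int max_int) <= 0
  then ShortIntLiteral (Int64.to_int n)
  else LongIntLiteral n

let () =
  (* Sys.int_size reports the width of the native int type. *)
  Printf.printf "native int has %d bits\n" Sys.int_size
```

The additional effort shows up downstream: every pass that does arithmetic on literals has to handle both constructors.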
You can't use the coding style used for recursive descent in the Dragon compiler book, without using mutable variables.
Do you have to use parser combinators, which have their own limitations?
In any case, OCaml has parser generators that are fast, do bottom-up parsing (hence handle left recursion without issue), and are not based on parser combinators, e.g. ocamlyacc [3].
I'd use parser combinators for quick prototypes, and, if measurement shows a performance problem, replace them with a generated (e.g. by ocamlyacc) parser. As far as I remember, the parser in OCaml's (superfast) compiler is generated by ocamlyacc.
[1] https://en.wikipedia.org/wiki/Left_recursion
[2] T. Ridge, Simple, functional, sound and complete parsing for all context-free grammars. http://www.tom-ridge.com/resources/ridge11parsing-cpp.pdf
[3] https://caml.inria.fr/pub/docs/manual-ocaml/lexyacc.html
JavaScript    | Reason
--------------+----------------------------
const x = y;  | let x = y;
let x = y;    | reference cells
var x = y;    | No equivalent (thankfully)
P.S. I'm not really familiar with ML/OCaml, but I have decent experience with large code bases in languages that are not very keen to protect you from yourself.
Speaking as a Haskell programmer, never use exceptions. You can get away with this advice because the Either monad allows you to have the behavior of exceptions (namely, at any point you can "fail" a computation and have the error automatically propagate up to the handler). However, this approach relies heavily on having a type system more advanced than OCaml's in order to be reasonable.
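For readers unfamiliar with the behaviour being described (fail at any point, have the error propagate to whoever handles it), here is how it can be written with the Result module and binding operators from OCaml's standard library, both available since 4.08. The functions parse_int, safe_div, and divide_strings are made up for the example. Whether this is as comfortable as Haskell's Either is a separate question, since the `let*` operator has to be defined or opened per error type.

```ocaml
(* Binding operator: let* x = r in body short-circuits on the first Error. *)
let ( let* ) = Result.bind

let parse_int s =
  match int_of_string_opt s with
  | Some n -> Ok n
  | None -> Error ("not an integer: " ^ s)

let safe_div a b =
  if b = 0 then Error "division by zero" else Ok (a / b)

(* Any step may fail; the first Error is returned and later steps are skipped. *)
let divide_strings x y =
  let* a = parse_int x in
  let* b = parse_int y in
  safe_div a b

let () =
  match divide_strings "10" "0" with
  | Ok n -> Printf.printf "result: %d\n" n
  | Error msg -> Printf.printf "error: %s\n" msg
```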
I personally think that OCaml is really good at this, because I started converting the Scheme examples from the PLAI book to OCaml and it just felt right (maybe because I'm not a fan of the Scheme syntax).