This is a great achievement for OCaml, but does anyone have an explanation of why it was so difficult for them to implement?
He stated that although he spent years working with Lisps (CL and Racket mostly) and Haskell, jumping into projects in those languages means immediately paying the price of learning the abstractions and DSLs that the authors wrote for the project before you can understand anything.
He compared it with C or Go, where he found it easier to read kernel code without any context, because there is no abstraction price to pay. What you see is what there is to understand.
He basically says that those languages (MLs, Lisps) are great for personal projects or very small teams but don't scale well. This is also reflected in open source, where many people build their own libraries and programs but very few end up collaborating in those communities.
IMO, OCaml is significantly easier to ramp up on than Haskell. The object model and the fact that the language lets you start with imperative code mean you can write working code early, and the language is feature-rich enough that you're generally not writing a DSL. Granted, I think it is still harder to ramp programmers up than in JS, Python, Java, or Go.
I think OCaml's biggest problem has been one of timing and value proposition. The language matured to some level of production readiness in the mid 2000s at which point multicore started to become important.
If you wanted extreme performance, OCaml increasingly couldn't compete with C++ or Java. Most shops don't need the robustness the superior type system could provide, and the other downsides of an unpopular language always loomed large when adoption was being considered.
I still think OCaml or something OCaml-like might become popular with time, but it will require the kind of improvements the multicore project is hinting at providing and more.
In F#, even for some DSLs, there's not much funny business at all. You have types and functions. I'm not as familiar with OCaml, but in F#, the only really confusing thing along the lines of DSLs or macros are computation expressions. But for the most part, you don't need them aside from the built-in `async` one.
For industrial applications in most deployment environments Haskell is a much more practical language than OCaml, because it has many more modern runtime features (green threads, STM, etc., which for many years have made the lack of multicore support in OCaml seem embarrassing). Nor has that gap really been closed now; OCaml is still many years behind Haskell in basic runtime features.
F# obviously has a practically useful and featureful runtime, but has a scheduler that makes it easy to get thread exhaustion, whereas Haskell has a preemptive one that will make that a non-issue.
I find that the "practical language" argument is usually used by people who have never used either OCaml or Haskell for solving real world problems. In practice OCaml is a did-not-finish versus the comparatively (to most other languages, notably losing to the BEAM languages) excellent finishing time of Haskell.
To paraphrase: the idea is that a well-designed library API gets you 90% of the benefits of a richer language’s features, but without exactly the penalty that Carmack calls out.
It does create dialects and potential silo effect. I never worked in real CL projects but books and articles mention to not abuse macros and DSLs because of that, old lispers are also often very educated .. and they rarely do things for trivial reasons. I'm regularly surprised by how well thought out things are.
I don't get it. In C you're programming with structures, data types and functions. In ML and Lisp, you're programming with structures, data types and functions. Lisp lets you muck with syntactic forms so maybe that has some obscuring effect, but I expect most programs are written in fairly direct style. Where people do add abstractions, it's to reuse code so there's ostensibly less code to understand overall.
IIRC, Mike was also doing a bunch of stuff with a Jabber client in OCaml around that time.
Being able to use C# libs is very nice but also it sucks that I don't have true null safety.
Who knows, maybe F# is always more performant than OCaml and it doesn't make sense to switch back.
That's actually on the way right now. The main concurrency library which is designed for OCaml 5, eio, internally uses effects but doesn't require its users to be exposed to them. Users write code in direct style and don't need to care about internal implementation with effects.
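To make "direct style on top of effects" concrete, here is a minimal sketch using only the standard library's `Effect` module from OCaml 5. It is not Eio's API: the `Ask` effect and the `run` and `user_code` names are made up for illustration. The point is only that the code performing the effect reads like ordinary straight-line code, while the handler (which a library could hide) decides what the effect means.

```ocaml
open Effect
open Effect.Deep

(* Hypothetical effect for illustration: ask the handler for an int. *)
type _ Effect.t += Ask : int Effect.t

(* "User" code: plain direct style, no callbacks or monads in sight. *)
let user_code () = perform Ask + 1

(* "Library" code: installs the handler and resumes the computation. *)
let run f =
  try_with f ()
    { effc = fun (type a) (eff : a Effect.t) ->
        match eff with
        | Ask -> Some (fun (k : (a, _) continuation) -> continue k 41)
        | _ -> None }

let () = assert (run user_code = 42)
```

A concurrency library can use the same mechanism to suspend a fiber at the `perform` and resume it later, which is why its users never need to write a handler themselves.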
That stuff all gets a bit screwy in OCaml: an int has 31 bits of information but is 32 bits in size, a record with two int64s ends up being twice the size you expect, etc.
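The tagged-integer point can be checked from the standard library. A small sketch, assuming a native or bytecode build (not a JavaScript backend): `Sys.int_size` reports the usable bits of an `int`, which is one less than the machine word, so 31 on 32-bit platforms and 63 on 64-bit ones.

```ocaml
let () =
  Printf.printf "word size: %d bits, int payload: %d bits\n"
    Sys.word_size Sys.int_size;
  (* One bit of every machine word is reserved as a tag for the GC. *)
  assert (Sys.int_size = Sys.word_size - 1);
  (* So native ints wrap around one bit earlier than a raw machine word. *)
  assert (max_int + 1 = min_int)
```

The record overhead mentioned above comes from a different mechanism: `int64` values are boxed, so each field is a pointer to a separately allocated block rather than an inline 8-byte value.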
https://github.com/ygrek/mldonkey (Stale project but large codebase.)
https://akabe.github.io/ocaml-jupyter/
https://github.com/moby/vpnkit (Used by Docker)
It's a code-syntax aware large-scale search-and-replace tool. E.g.,
comby -matcher .scala -review 'foo(:[x], :[y])' 'foo(:[x])'
This will search in the current directory tree for all files that contain the code pattern 'foo(x, y)' and replace it with 'foo(x)', using Scala syntax rules. It's super convenient for doing large-scale codemods. E.g. https://github.com/tinymce/rescript-webapi/pull/40
- https://github.com/mirage/mirage
- https://github.com/returntocorp/semgrep
- https://github.com/bcpierce00/unison
See https://v2.ocaml.org/learn/companies.html for some more leads, as lots of those companies maintain useful OSS software.
[1] https://www.fftw.org/fftw3_doc/Generating-your-own-code.html
https://github.com/rust-lang/rust/tree/ef75860a0a72f79f97216...
I've recently completed bugfixing/testing on 4.14.1+no-naked-pointers, and 5.0 compatibility is not far behind (we're usually 1 or 2 compiler versions behind latest, e.g. current production releases are built using 4.13.1)
Disclaimer: I work on the XAPI project as part of my job, the project itself is >15 years old at this point.
I'm super curious about OCaml, picked it up and left it several times in the last 3 years. Now that multicore is here I'll absolutely be picking it up again and try to use it for parallel scripting and for some of my work. Great job!
The paper we wrote on retrofitting effect handlers: https://arxiv.org/abs/2104.00250 also has some http benchmarks
There is other cool stuff being worked on, which I am very excited about: https://discuss.ocaml.org/t/jane-street-compiler-development.... Hopefully, we will see many of these make it into OCaml 6.
I'm more ambivalent regarding the local allocations and the unboxed types. I totally understand why they'd be useful when you are trying to squeeze every last drop of performance, but they do require a not-so-trivial complexification of the language.
Right now, I am not even taking a guess of what will be the defining new major features of OCaml 6 (effect system + modular implicits maybe? Maybe not?).
The results section of the paper only compares the performance of Multicore OCaml with plain OCaml, not OCaml vs C++ / Java.
Great work OCaml devs!
I also don't think F# is as fast as OCaml?
Fairly relevant piece of culture on why a company switched from OCaml to F# https://blog.darklang.com/new-backend-fsharp/
might be now in there.
You don’t have to deal with the typical need to determine what packages to use as you would in OCaml. I have aspnet and co. I’m okay with OOP leaking here and there when I get most of what I need already: an ML tool with a huge ecosystem.
This looks much more interesting, on a skim. Thanks!
Look at Eio here: https://github.com/ocaml-multicore/eio
I think eventually it will hopefully turn into something like what languages like purescript have which would be really cool.
(I've only used ocaml a tiny bit and use f# a lot more but I keep periodically checking the status of this because it's something that would make ocaml a lot more interesting to me.)
Multicore / multiprocessor systems were not a mainstream thing in consumer hardware until the 21st century.
But even in late 90s it was still common for desktop Win9x apps to use the main window message loop for async processing (Win32 API itself heavily encouraged it at the time - e.g. that's how OS timers work) in lieu of threads.
You do know that C didn't have multithreading (edit: in the language spec) until C11, right?
It was common in languages of that era. Also OCaml has had libraries for multithreading for many years, just like C has POSIX threads and things...
https://signalsandthreads.com/what-is-an-operating-system/
He talked about the work to put a multicore-ready memory model[1] and GC[2] under OCaml.
[1] https://anil.recoil.org/papers/2018-pldi-memorymodel.pdf
Hoping they'll get more content out soon.
(Was a very enjoyable episode though!)
Anil covers it in a bit more plain English in this Signals and Threads episode. Here's a bit of the transcript, starting at about 50 minutes in: https://signalsandthreads.com/what-is-an-operating-system/
> Ron: Do you have a pithy example of a pitfall in multicore Java that doesn’t exist in multicore OCaml?
> Anil: There’s something called a data race. And when you have a data race, this means that two threads of parallel execution are accessing the same memory at the same time. At this point, the program has to decide what the semantics are. In C++, for example, when you have a data race, it results in undefined behavior for the rest of the program, the program can do anything. Conventionally, daemons could fly out of your nose is an example of just what the compiler can do.
> In Java, you can have data races that are bounded in time so the fact that you change a value can mean later on in execution, because of the workings of the JVM, you can then have some kind of undefined behavior. It’s very hard to debug because it is happening temporally across executions of multiple threads.
> In OCaml, we guarantee that the program is consistent and sequentially consistent between data races. It’s hard to explain any more without showing you fragments of code. But conceptually, if there’s a data race in OCaml code, it will not spread in either space or time. In C++, if there’s a data race, it’ll spread to the rest of the codebase. In Java, if there’s a data race, it’ll spread through potentially multiple executions of that bit of code in the future.
> In OCaml, none of those things happen. The data race happens, some consequence exists in that particular part of the code but it doesn’t spread through the program. So if you’re debugging it, you can spot your data race because it happens in a very constrained part of the application and that modularity is obviously essential for any kind of semantic reasoning about the program because you can’t be looking in your logging library for undefined behavior when you’re working on a trading strategy or something else. It’s got to be in your face, at the point.
(and so on)
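Since the transcript says this is hard to explain without code, here is a small sketch of a bounded data race, assuming OCaml 5's `Domain` and `Atomic` modules (the names `racy`, `safe` and `work` are made up for illustration). Two domains increment a plain `ref` without synchronisation. Updates can be lost, but the damage stays local to that one cell: it still holds a value some increment actually wrote, and the atomic counter next to it is unaffected.

```ocaml
let n = 100_000

let racy = ref 0            (* plain mutable cell: increments may be lost *)
let safe = Atomic.make 0    (* atomic cell: every increment is counted *)

let work () =
  for _ = 1 to n do
    incr racy;              (* data race: unsynchronised read-modify-write *)
    Atomic.incr safe
  done

let () =
  let d1 = Domain.spawn work and d2 = Domain.spawn work in
  Domain.join d1;
  Domain.join d2;
  (* The race can lose updates, but it cannot invent values or crash. *)
  assert (!racy > 0 && !racy <= 2 * n);
  (* Unrelated, properly synchronised state is not affected by the race. *)
  assert (Atomic.get safe = 2 * n)
```

The exact final value of `racy` varies from run to run, which is why the check is a range rather than a fixed number.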
Specifically how the “better” crowd spent a lot of time trying to solve PCLSRing while the “worse” crowd just said: meh, throw an error and let the program figure it out.
Previously
Successful implementations have now been made in several languages.
[1] https://ats-lang.sourceforge.net/DOCUMENT/INT2PROGINATS/HTML...
I now run mainly on windows and this is an issue for me to try OCaml
opam switch create 5.0 --repos=dra27=git+https://github.com/dra27/opam-repository#windows-compilers --packages=ocaml.5.0.0,ocaml-option-mingw
I felt an ergonomic and modern syntax is the only missing piece in OCaml.
def example():
    x = 5
print("Hello world")
What's the mistake here? Depending on whether the print is part of the function, it should either be indented or have a newline before it. The point is you (and any formatting tool) can't know what the horizontal alignment of this code should be just by examining the vertical line order. You can only determine this by knowing (or reanalyzing) the semantics of the code. During a refactor where you're moving around lots of code, this can be a significant PITA. However, in the JS example,
function example() {
    let x = 5
console.log("Hello world")
}
it's unambiguous what the mistake is because you can determine the correct formatting entirely from the line order, without having to know anything about the code's semantics.
Adding spurious curly braces to make OCaml look more like JS is not 'ergonomic and modern' syntax, it's a step backwards if anything (though I can appreciate there is some sense to that in context of a compile-to-JS language for frontend dev)
Threads belong to a domain and only one thread can hold the runtime lock for the domain. This is the same behaviour as in OCaml 4.
With OCaml 5 you can have as many domains as you want (though we recommend no more than you have cores).
And once a program (and all its dependencies) has removed its dependence on global state, it can opt in to multicore by spawning additional domains.
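As a rough illustration of opting in, here is a minimal sketch that splits a sum across domains with the standard library's `Domain.spawn` and `Domain.join`. The helper names `sum_range` and `parallel_sum` are made up for this example, and the cap of 4 is arbitrary. Each worker only touches its own local accumulator, so there is no shared mutable state to worry about.

```ocaml
(* Sequential sum of lo..hi using a domain-local accumulator. *)
let sum_range lo hi =
  let acc = ref 0 in
  for i = lo to hi do acc := !acc + i done;
  !acc

(* Split 1..n into [domains] chunks and sum each chunk in its own domain. *)
let parallel_sum ~domains n =
  let chunk = n / domains in
  let workers =
    List.init domains (fun d ->
      let lo = d * chunk + 1 in
      let hi = if d = domains - 1 then n else (d + 1) * chunk in
      Domain.spawn (fun () -> sum_range lo hi))
  in
  List.fold_left (fun total w -> total + Domain.join w) 0 workers

let () =
  (* Stay at or below the core count, as recommended above. *)
  let domains = min 4 (Domain.recommended_domain_count ()) in
  let n = 10_000 in
  assert (parallel_sum ~domains n = n * (n + 1) / 2)
```

Note that the main domain here only waits on `Domain.join`; a more careful version would let it process one of the chunks itself instead of spawning an extra domain.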
Its semantics are weaker than STM's: unlike STM, it doesn't provide serializability, but Reagents can compile down to multi-word compare-and-swap operations, which can be implemented with the help of hardware transactions (when present) or efficient software implementations of them [2]. Hence, Reagent programs should be faster than STM.
[1] https://github.com/ocaml-multicore/reagents
[2] https://arxiv.org/pdf/2008.02527.pdf
https://sixthhappiness.github.io/articles/python-scheme-and-...
Does anyone have any multicore benchmarks that illustrate performance increases in 5.0?
https://github.com/samuell/gccontent-benchmark
:D
Wondering if anyone can comment on whether this might mean OCaml can be a contender in that space now?
What are the use cases for OCaml?
I know this one [0] so far. Also the famous Coursera PL course [1] covers ML.
I wonder how this will affect ReScript
Modern MMUs don't require full TLB flushes when switching address spaces, and physically tagged cache lines allow the ring buffers to be shared in cache even if they're mapped at different virtual addresses in the different processes. Also, you're guaranteed to not have false sharing, and there's zero contention for locks within malloc/free.
I don't mean to suggest there are no advantages to full address space sharing, but they're both fewer and less than one would initially assume.
Congratulations to the team!
What are the next plans for the project? Spread newly available features..?
In terms of the compiler and runtime development, the OCaml and ML Workshops at ICFP in October have videos that cover some of the experimental work happening: https://watch.ocaml.org/video-channels/ocaml2022/videos and https://www.youtube.com/playlist?list=PLyrlk8Xaylp7f8T7L5SFF...
There's also a compiler development newsletter that's posted on the discuss at regular intervals which details some of the other work happening: https://discuss.ocaml.org/t/ocaml-compiler-development-newsl...
$ opam update
$ opam switch create 5.0.0 --repositories=default
$ eval $(opam env)
$ ocaml
OCaml version 5.0.0
Enter #help;; for help.
Any plans to backport its design back to Opam?
Just Hotwire Strada left.
At first glance, I understood the title "OCaml 5.0 Multicore is out" to mean "Multicore is no longer part of OCaml 5.0".