(for maybe the second to last time)
Congratulations on the merge, and thank you for all the work that went into it!
It is hard or even counter-productive to try to fit them on a roadmap.
The multicore project is (was?) a bit unusual from the point of view of OCaml development, since it is a massive engineering effort with a focused team.
Nevertheless, I hope that we continue and extend the "OCaml compiler bimonthly" news to give more information about the ongoing work on the compiler.
Maybe a before and after, but one that focuses on expressiveness: what you were not able to say/express in OCaml before that is now possible, not just speed or performance.
https://github.com/ocaml-multicore/effects-examples has links to tutorials and examples for how effects can be used.
There are also some slides from KC's talk on effect handlers https://kcsrk.info/slides/handlers_edinburgh.pdf and materials from the CUFP 17 tutorial: https://github.com/ocamllabs/ocaml-effects-tutorial
https://gopiandcode.uk/logs/log-bye-bye-monads-algebraic-eff... this is also a great introduction
We've written a couple of papers detailing the internals and the trade-offs involved: https://arxiv.org/abs/2004.11663 (for parallelism) and https://arxiv.org/abs/2104.00250 (for effects)
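For a feel of what effects look like in code, here is a minimal sketch, assuming OCaml 5.0 or later, where the `Effect` module is part of the standard library. The `Ask` effect and the helper names are made up for illustration and do not come from any of the linked materials.

```ocaml
(* Assumes OCaml >= 5.0 (Effect is in the standard library). *)

(* Hypothetical effect for illustration: ask the handler for an int. *)
type _ Effect.t += Ask : int Effect.t

(* Direct-style code: no monad, just calls to perform. *)
let sum_of_two_asks () = Effect.perform Ask + Effect.perform Ask

(* Run [f], resuming every Ask with the value [v]. *)
let run_with_answer (v : int) f =
  Effect.Deep.try_with f ()
    { Effect.Deep.effc =
        (fun (type a) (eff : a Effect.t) ->
          match eff with
          | Ask ->
              Some (fun (k : (a, _) Effect.Deep.continuation) ->
                  Effect.Deep.continue k v)
          | _ -> None) }

let result = run_with_answer 21 sum_of_two_asks
let () = Printf.printf "%d\n" result
```

The point is that `sum_of_two_asks` is ordinary direct-style code, and the handler decides what `Ask` means.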
Added to that is the complexity of tracking a moving target. Multicore had to be rebased through 12 releases of OCaml, which in itself was a non-trivial amount of work.
> (for maybe the second to last time)
How come?
Whether it's beneficial or not is unclear though.
This course text is also great, I don't know if it's listed at ocaml.org/learn or not:
In many popular languages today (e.g. C++), if you have a data race you're completely screwed (e.g. the program exits successfully, even though it was supposed to loop forever serving TCP requests; good luck figuring out why). In Java they decided that's not acceptable, so the effects of a data race are constrained to only the data touched by the race and, importantly, that data is still legal; it just might be astonishing (e.g. you were adding several small positive integers together from a shared data structure in parallel, but due to a misunderstanding in your design this was actually a data race, and some time later somehow your total is now zero, but you won't crash or whatever).
OCaml intends to further constrain the consequences in time: if the total was 114 when you stopped adding, it will still be 114 later. It won't mysteriously become zero (or any other value) thanks to a data race which must have happened before you checked.
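To make "data race" concrete, here is a small sketch, assuming OCaml 5.0 or later (`Domain` and `Atomic` from the standard library). The function names are made up for illustration. Two domains bump a shared counter, once through a plain `ref` (a data race) and once through an `Atomic`. The racy version may lose updates, so its total can come out below `2 * n`, but the value is still an ordinary int and the program carries on; the atomic version always reaches exactly `2 * n`.

```ocaml
(* Assumes OCaml >= 5.0 (Domain and Atomic are in the standard library). *)
let n = 100_000

(* Racy version: two domains increment a plain ref with no synchronisation.
   Increments can be lost, so the result may be less than 2 * n. *)
let racy_total () =
  let counter = ref 0 in
  let work () = for _ = 1 to n do counter := !counter + 1 done in
  let d1 = Domain.spawn work in
  let d2 = Domain.spawn work in
  Domain.join d1;
  Domain.join d2;
  !counter

(* Race-free version: the same loop using an atomic increment. *)
let atomic_total () =
  let counter = Atomic.make 0 in
  let work () = for _ = 1 to n do Atomic.incr counter done in
  let d1 = Domain.spawn work in
  let d2 = Domain.spawn work in
  Domain.join d1;
  Domain.join d2;
  Atomic.get counter

let racy = racy_total ()
let exact = atomic_total ()
let () = Printf.printf "racy=%d (at most %d), atomic=%d\n" racy (2 * n) exact
```

This only shows lost updates; it does not try to demonstrate the finer points of the memory model discussed above.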
[I'm sure I have some details wrong, but this is the gist]
What remains to be seen is: is that enough? When Java made its rules, there was great hope that they were enough and that programmers could understand what was wrong in a Java program with data races. As I understand it, that did not pan out. So it seems possible to me that OCaml ends up in the same situation.
Could you clarify why/how this would happen? Is it because the last process sets it to zero to initialize a variable?
I can’t conceptualize the steps that would end up with this result if all processes are adding. It seems like it would at least equal the result of the final thread’s calculation.
The 'LDRF' (local data race freedom) property is described in detail here: https://anil.recoil.org/papers/2018-pldi-memorymodel.pdf
The performance numbers are in the paper abstract: "our evaluation demonstrates that it is possible to balance a comprehensible memory model with a reasonable (no overhead on x86, ~0.6% on ARM) sequential performance trade-off in a mainstream programming language". It's a little higher on PowerPC but still very usable, and RISC-V overheads should be roughly comparable to ARM.
For this reason (and others), forking child processes has been a common alternative on many OCaml projects.
See here for an update re: the whole multicore initiative, and links to more information:
https://discuss.ocaml.org/t/multicore-ocaml-december-2021-an...
Even in Rust, no one really knows at this point what is safe to do in a fork from a multithreaded program, or in a signal handler in one. Signal handlers and forks are thus simply “unsafe” in Rust with “care must be taken”, but there is no real explanation of what care, and Rust does not document or stabilize which of its functions are async-signal-safe, as C does.
[1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/fo...
The description is in the first paragraph of the PR body text:
> This PR adds support for shared-memory parallelism through domains and direct-style concurrency through effect handlers (without syntactic support). It intends to have backwards compatibility in terms of language features, C API, and also the performance of single-threaded code.
P.S. There were ways around it, like calling an extern C function which would spawn a new thread, call, and pass control back, but those were almost never used.
There's also now https://github.com/talex5/lwt_eio, which allows you to run existing Lwt code alongside code using effects, to aid with porting.
- https://github.com/ocaml-multicore/eio#readme for more information on the Eio library
- https://watch.ocaml.org/videos/watch/74ece0a8-380f-4e2a-bef5... is a short talk on experiences using effects with some nice motivating examples
Note that Eio is more than just a direct replacement for Lwt and Async. We couldn't resist also using some of the experience gained from the MirageOS (mirage.io) unikernel framework in Eio. This means that the backends are highly optimised to use the best syscalls available in the OS (e.g. io_uring by default on Linux). If you write your applications to use Eio natively, then performance is very high so far. The ergonomics of programming in it also compare favourably to using monadic concurrency.
I.e. will the httpaf server library need to import it as an external dependency, like it currently does with Lwt, or will Eio already be available?
Credit to Sam Westrick for turning me on to this [1].
[0] https://github.com/athas/raytracers
[1] https://twitter.com/shwestrick/status/1480587660691480579
PR to Merge Multicore OCaml - https://news.ycombinator.com/item?id=29638152 - Dec 2021 (155 comments)
Past related threads:
Multicore OCaml: October 2021 - https://news.ycombinator.com/item?id=29238972 - Nov 2021 (12 comments)
Effective Concurrency with Algebraic Effects in Multicore OCaml - https://news.ycombinator.com/item?id=28838099 - Oct 2021 (59 comments)
Multicore OCaml: September 2021, effect handlers will be in OCaml 5.0 - https://news.ycombinator.com/item?id=28742033 - Oct 2021 (3 comments)
Multicore OCaml: September 2021 - Effect handlers will be in OCaml 5.0 - https://news.ycombinator.com/item?id=28719088 - Oct 2021 (3 comments)
Adapting the OCaml Ecosystem for Multicore OCaml - https://news.ycombinator.com/item?id=28440385 - Sept 2021 (1 comment)
Adapting the OCaml Ecosystem for Multicore OCaml - https://news.ycombinator.com/item?id=28373155 - Aug 2021 (21 comments)
Multicore OCaml: July 2021 - https://news.ycombinator.com/item?id=28039219 - Aug 2021 (14 comments)
Multicore OCaml: May 2021 - https://news.ycombinator.com/item?id=27480678 - June 2021 (27 comments)
Multicore OCaml: April 2021 - https://news.ycombinator.com/item?id=27140522 - May 2021 (89 comments)
Multicore OCaml: Feb 2021 with new preprint on Effect Handlers - https://news.ycombinator.com/item?id=26424785 - March 2021 (29 comments)
Multicore OCaml: October 2020 - https://news.ycombinator.com/item?id=25034538 - Nov 2020 (9 comments)
Multicore OCaml: September 2020 - https://news.ycombinator.com/item?id=24719124 - Oct 2020 (43 comments)
Parallel Programming in Multicore OCaml - https://news.ycombinator.com/item?id=23740869 - July 2020 (15 comments)
Multicore OCaml: May 2020 update - https://news.ycombinator.com/item?id=23380370 - June 2020 (17 comments)
Multicore OCaml: March 2020 update - https://news.ycombinator.com/item?id=22727975 - March 2020 (37 comments)
Multicore OCaml: Feb 2020 update - https://news.ycombinator.com/item?id=22443428 - Feb 2020 (80 comments)
State of Multicore OCaml [pdf] - https://news.ycombinator.com/item?id=17416797 - June 2018 (103 comments)
OCaml-multicore now at 4.04.2 - https://news.ycombinator.com/item?id=16646181 - March 2018 (4 comments)
A deep dive into Multicore OCaml garbage collector - https://news.ycombinator.com/item?id=14780159 - July 2017 (89 comments)
Lock-free programming for the masses - https://news.ycombinator.com/item?id=11907584 - June 2016 (29 comments)
Lock-free programming for the masses - https://news.ycombinator.com/item?id=11893911 - June 2016 (4 comments)
OCaml 4.03 will, “if all goes well”, support multicore - https://news.ycombinator.com/item?id=9582980 - May 2015 (113 comments)
Multicore OCaml - https://news.ycombinator.com/item?id=8003699 - July 2014 (1 comment)
Popular projects often get submissions of every step of the release lifecycle, from major commits, to PRs, to merges into main branches, to alpha releases and beta releases and GA releases. These are all significant milestones for people who care about the project, and the project is deservedly popular! But from an HN point of view, they are distinctions without differences, because the underlying topic of discussion is always the same—the project in general.
I wrote a detailed explanation about just this sort of situation here: https://news.ycombinator.com/item?id=23071428.
Further explanations at these links:
https://hn.algolia.com/?dateRange=all&page=0&prefix=false&so...
https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...
https://hn.algolia.com/?dateRange=all&page=0&prefix=false&so...
Work on this is on-going via the sandmark benchmarking suite: https://github.com/ocaml-bench/sandmark
In short the expectation should be that single-threaded code performs roughly the same (single digit percentage changes) as on the sequential runtime.
Parallel code on multicore can see close to linear speedups on 64 cores, though it depends significantly on your workload. If you're interested in parallelising existing OCaml code, I gave an example-driven OCaml workshop talk in 2020: https://www.youtube.com/watch?v=Z7YZR1q8wzI
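As a rough idea of what parallelising a loop with domains can look like, here is a minimal sketch, assuming OCaml 5.0 or later. It uses only `Domain.spawn` and `Domain.join` from the standard library; `sum_range` and `parallel_sum` are names made up for this example and are not taken from the talk. It splits the sum 1 + 2 + ... + n into one chunk per domain.

```ocaml
(* Assumes OCaml >= 5.0 (Domain is in the standard library). *)

(* Sequential sum of the integers lo..hi (0 if the range is empty). *)
let sum_range lo hi =
  let acc = ref 0 in
  for i = lo to hi do
    acc := !acc + i
  done;
  !acc

(* Split 1..n into [domains] contiguous chunks, sum each chunk in its own
   domain, then add up the partial results. *)
let parallel_sum ~domains n =
  let chunk = (n + domains - 1) / domains in
  let spawn_chunk k =
    let lo = (k * chunk) + 1 in
    let hi = min n ((k + 1) * chunk) in
    Domain.spawn (fun () -> sum_range lo hi)
  in
  List.init domains spawn_chunk
  |> List.map Domain.join
  |> List.fold_left ( + ) 0

let () = Printf.printf "%d\n" (parallel_sum ~domains:4 1_000_000)
```

Each domain works on its own local accumulator, so there is no shared mutable state and no data race. Whether this gives a speedup depends on the workload and on how many cores are available.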
[0]: https://github.com/ocaml-bench/sandmark
[1]: https://github.com/ocaml-multicore/retro-httpaf-bench/pull/1...
Personally, I've also been lucky enough to have worked for two different employers (not in the finance or compiler space) over the past 2 years where I've mostly used OCaml for writing things that many people might consider "boring", namely lots of web services, database drivers, data processing, etc.
How does OCaml compare?
From a quick skim of papers, it seems it provides sequential consistency when using atomics and acquire/release semantics when not.
Sounds like a pretty bad design.
(author of the said paper here) This is wrong. I suggest reading the paper closely. If not, The Morning Paper has a good summary [1,2].
[1] https://blog.acolyer.org/2018/08/09/bounding-data-races-in-s...
[2] https://blog.acolyer.org/2018/08/10/bounding-data-races-in-s...
It's not obvious to me from those articles how what I said is inaccurate.
In other words, I am keen to not reproduce the announcement that "multicore might be in 4.03" from 5 years ago.
I have been interested in OCaml for a while (mainly because of ReasonML), never really got to it unfortunately, but maybe this is the time.