There's also the Quasar library, which adds fiber support to existing Java projects, but it's mostly unmaintained since its maintainers were pulled in to work on Project Loom.
Then there's Project Loom, an active branch of OpenJDK with language support for continuations and a fiber threading model. The prototype is done and the team is in the optimization phase. I expect fibers to land in the Java spec somewhere around JDK 17.
I figure it's fair to mention these, as the author's criticisms are somewhat valid but won't be for very long (a few years at most?).
In summary: Java will have true fiber support "soon", which will invalidate the arguments for Erlang's concurrency model. They are already outdated if you are okay with mixing in Kotlin coroutines or using the Quasar library.
The newer Java GCs, Shenandoah and ZGC, address the author's criticisms of pause times. They already exist, are free, and ship in stable releases. Dare I say they are almost certainly better than Erlang's GC. They are truly state of the art, arguably far superior to the GCs used in Go, .NET, etc. Pause times are ~10 milliseconds at the 99.5th percentile for multi-terabyte heaps, with average pauses well below 1 millisecond. No other GC'ed language comes close, to my knowledge. The author's points 1 and 2 no longer apply with these collectors: you don't need 2x memory for the copy phase, and the collectors quickly return unused memory to the OS. This has been the case for several years.
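For reference, opting in is just a command-line flag. A sketch, assuming a JDK recent enough to ship these collectors (before JDK 15, ZGC also requires the experimental-options unlock, and Shenandoah is only present in builds that include it; `MyApp` is a placeholder):

```shell
# ZGC (experimental before JDK 15)
java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xmx16g MyApp

# Shenandoah (in JDK builds that include it)
java -XX:+UnlockExperimentalVMOptions -XX:+UseShenandoahGC -Xmx16g MyApp
```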
Hot code reloading? The JVM supports this extensively and it's used all the time. Look into ByteBuddy, CGLIB, ASM, and Spring AOP if you want to know more. Java also supports code generation at build time using annotation processors, which is likewise extensively used (and abused) to get rid of language cruft.
What about failure domains? As far as I'm concerned, this is the strongest reason for actor-based concurrency. I can design my architecture so that groups of processes that need to die together die together. And it's usually one or two lines of code, if any.
Here's a real-life example. I have a process that maintains an SSH connection to a host machine, and that SSH connection is used to query information about running VMs on that host. If the SSH connection dies, it kills the process tracking the host machine, which in turn kills the processes tracking the associated VMs, without perturbing any of the other hosts' processes or VMs. This triggers the host process to be restarted by a supervisor, which creates a new SSH connection to query for information (possibly repopulating the VM processes it tracks). I wrote zero lines of code for all of this (which, importantly, means I made no mistakes), just one or two configuration options. More importantly, the system doesn't get stuck in an undefined state where complex query failures can cause logjams in the running system.
In Java you would create a thread pool and configure it to restart its threads if they die. Each thread would wake up every so often to query over SSH and dump its results into a queue. If the query threads die, the consumers reading the queue at the other end have nothing to do, so they simply won't execute. It's easy to make a consumer queue that executes some code on another thread whenever data arrives.
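A minimal sketch of that shape; the "query" here is a hard-coded stand-in for a real SSH call, and all names are made up:

```java
import java.util.concurrent.*;

public class PollingPipeline {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> results = new LinkedBlockingQueue<>();

        // Producer side: a scheduled pool re-runs the query periodically.
        // The try/catch matters: an uncaught exception suppresses all future
        // runs of a scheduled task, so "restarting" here means catching.
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(2);
        pool.scheduleWithFixedDelay(() -> {
            try {
                results.offer("host-1: 3 VMs running"); // stand-in for an SSH query
            } catch (RuntimeException e) {
                // log it; the next scheduled run will try again
            }
        }, 0, 100, TimeUnit.MILLISECONDS);

        // Consumer side: take() blocks until data arrives, so the consumer
        // simply does nothing while the producers are down.
        System.out.println(results.take());

        pool.shutdownNow();
    }
}
```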
Java's exposure of the underlying OS threads and cheap transfer of data between threads lets people build libraries on top that offer the memory models used by Erlang and others. It's not built in or quite as convenient, but you can use actors and fibers in Java if you want to.
The failure domain here isn't precisely defined because shared data is allowed (but not required). You could define it as "anything reachable from the thread/fiber stack".
It would allow you to write something like:
    try (var scope = FiberScope.open(Option.PROPAGATE_CANCEL)) {
        var fiber1 = scope.schedule(() -> sshKeepAlive());
        var fiber2 = scope.schedule(() -> trackHost());
        var fiber3 = scope.schedule(() -> trackVMs());
    }
With the guarantee that if any fiber fails (which you bind to cancellation), all the others will be cancelled.

Does Java's hot code reloading support data migration? One benefit of Erlang's model is that you can execute hooks when hot code reloading is performed to make sure your data in memory is migrated to the new format.
But really, the most important thing about Erlang's actor model is error handling. If I spin up a process in Erlang and it fails, it won't corrupt the state of my other processes. In Java this can only be attained through discipline, since all memory is shared. Also, I can very easily specify which processes should work together as units, such that if one fails, they all fail and can be restarted together from a known working state. This, again, requires discipline in Java.
Not sure what you mean by data migration on code reloading. I suspect the mechanisms are different enough that they can't be compared. With Java you can load arbitrary new code, but changes to already-loaded classes are limited in ways that prevent data incompatibilities. For example, the stock JVM lets you swap method bodies, but adding fields or changing the type of existing ones requires tools like DCEVM.
Data corruption from threading is rare in Java; I can't remember the last time I ran into it. It's easy to cause, but everyone is used to threads, and the concurrency implementation is one of the best I've used. Java also has thread groups to help threads die and get restarted together. It's not automatic, you need to manage the groups yourself, but I think it achieves the same.
As a counter-point, I've been working on a platform for the last few years that uses Kotlin and Quasar in production. Quasar was cool at first, but now it's just a nightmare and I wish we had never opted to use it. It leaks abstractions all over the place with @Suspendable annotations, and users of the platform find the Quasar-related errors super confusing. Debugging is also very difficult because of Quasar. On the other hand, Kotlin is great!
If I could turn back time, I'd build the messaging/async workflow part of the platform in Erlang. I've mentioned this to a few people, but they all think I'm mad... "Erlang... are you on drugs?!", which is disappointing because it's literally perfect for our use case.
Why didn't you use Kotlin coroutines? My understanding is that they achieve the same as Quasar without the insanity.
You may also want to look at Vert.x. It's evolved into a lot more than a REST framework. It uses thread-per-core and non-blocking I/O to achieve high performance instead of green threads. It theoretically performs better because there aren't a lot of stacks hanging around, just one thread per core. There are a lot of callbacks, though, so if you're not used to RxJava-style chaining it's hard to get used to. It's very much like Node.
Erlang or Go would be the easiest if you need a lot of threads. If you just need high performance with a lot of connections, Vert.x may suffice. Java I/O in recent years is fully non-blocking, so you don't need a lot of threads for high concurrency. Vert.x can handle millions of concurrent clients, enough that you will need to tune your kernel to hit its limits. And it's built on Netty, which is rock solid.
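The non-blocking I/O mentioned above is in the standard library. A minimal sketch of the selector pattern that frameworks like Netty build on: one thread multiplexes all connections, here exercised by a single loopback client (the echo logic and names are ours, not Netty's):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.*;
import java.nio.charset.StandardCharsets;

public class NonBlockingEcho {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
        int port = ((InetSocketAddress) server.getLocalAddress()).getPort();

        // A blocking client, just to exercise the server.
        SocketChannel client = SocketChannel.open(new InetSocketAddress("127.0.0.1", port));
        client.write(ByteBuffer.wrap("ping".getBytes(StandardCharsets.UTF_8)));

        boolean echoed = false;
        while (!echoed) {
            selector.select(); // parks until some channel is ready
            for (SelectionKey key : selector.selectedKeys()) {
                if (key.isAcceptable()) {
                    SocketChannel conn = server.accept();
                    conn.configureBlocking(false);
                    conn.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    SocketChannel conn = (SocketChannel) key.channel();
                    ByteBuffer buf = ByteBuffer.allocate(64);
                    conn.read(buf);
                    buf.flip();
                    conn.write(buf); // echo back
                    echoed = true;
                }
            }
            selector.selectedKeys().clear();
        }

        ByteBuffer reply = ByteBuffer.allocate(64);
        client.read(reply); // blocking read of the echo
        reply.flip();
        System.out.println(StandardCharsets.UTF_8.decode(reply));
        client.close();
        server.close();
        selector.close();
    }
}
```

One selector thread like this can service thousands of sockets, which is why thread count stops being the bottleneck.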
Java's new garbage collectors, ZGC and Shenandoah, have average pause times of 0.3 milliseconds on heaps under 4 GB. I find it unlikely that another language has pause times shorter than that, given the sheer amount of work put into Java GC over the years.
The biggest problem with Erlang is that hardly anything out there needs this level of concurrency and robustness in a single system; in the new world of microservices and serverless architectures there are other ways to cope with scaling. That is its main selling point, and unfortunately in all other areas Erlang is significantly outdated and refuses to evolve, even more so than the Java language, which is a dinosaur in itself.
Having said that I think Erlang is a fantastic teaching tool and should be on everyone's bucket list of "things to learn in this life as a software engineer".
I wondered about this myself before I started using Elixir. In practice, it turns out that when it's cheap to make things concurrent, more services take advantage of it.
Tests and the Elixir compiler are extremely fast because of this, and it makes the whole development experience better.
Because the primitives are so simple, people experiment more, which makes for better software. Nobody would come up with Phoenix LiveView for the Play Framework in their spare time, because Play is so overly complicated.
I interviewed at a major streaming company where one critical component was written in Erlang. It had been working great, but the person who wrote it had left some time ago and nobody there knew Erlang, so they would have to rewrite it if an update were needed.
I never understood this often-repeated point. As a junior/mid-level developer I had the privilege of running self-written .jar files on government-scale systems with more than 50 cores. I used Java thread pools and concurrent data structures to do heavy cross-thread caching.
It was all pretty simple and concurrency & parallelism were never an issue but simply a necessity to make things run fast enough.
Am I a concurrent-programming genius? Were the types of problems/challenges I was solving too simple? When is concurrency in Java ever hard?+
+ I know about Java masterpieces like the LMAX Disruptor that are mostly beyond my skill level, but those are low-level, write-once libraries you wouldn't write yourself.
Potentially racy stuff:
* Synchronized primitives don't compose. You can safely call a `synchronized get(...)` and safely call a `synchronized put(...)`, but their composition, put(get(...) + 1), isn't synchronized. And it's hard to mentally re-verify at the end of the day: if you have a class with some methods marked synchronized, nothing will tell you whether you've synchronized the right ones. You just have to think it all through again and hope you reach the same conclusions as before.
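A sketch of that composition problem; the class and method names are made up for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// get() and put() are each synchronized, but get-then-put is not atomic.
class CounterStore {
    private final Map<String, Integer> map = new HashMap<>();

    synchronized int get(String key) { return map.getOrDefault(key, 0); }
    synchronized void put(String key, int value) { map.put(key, value); }

    // Racy: two threads can both read the same value and both write value+1,
    // losing an increment, even though every individual call holds the lock.
    void incrementRacy(String key) { put(key, get(key) + 1); }

    // Safe: the whole read-modify-write happens under one lock.
    synchronized void incrementSafe(String key) {
        map.put(key, map.getOrDefault(key, 0) + 1);
    }
}

public class SyncComposition {
    public static void main(String[] args) throws InterruptedException {
        CounterStore store = new CounterStore();
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 10_000; j++) store.incrementSafe("hits");
            });
            workers[i].start();
        }
        for (Thread t : workers) t.join();
        // 40000 with incrementSafe; incrementRacy would usually come up short.
        System.out.println(store.get("hits"));
    }
}
```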
Other (non-racy) stuff:
* Threads are heavy, CompletableFutures are light. But CFs lack the functionality of threads: a CF can't decide to sleep for a while, nor can it be cancelled mid-computation. (As an aside, BEAM processes are super light.)
You can achieve arbitrary non-blocking delays with the crufty ScheduledThreadPoolExecutor, or more sanely with RxJava. Really, it's just dangerous to do non-blocking stuff in Java without a wrapper like RxJava. That's not a good thing; I look forward to the day there are real fibers.
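A sketch of the scheduled-executor approach: no thread sleeps, the scheduler completes the future later and work resumes via a callback (the helper name is ours; JDK 9's CompletableFuture.delayedExecutor packages the same idea):

```java
import java.util.concurrent.*;

public class NonBlockingDelay {
    // Parks no thread: a shared scheduler completes the future after a delay.
    static CompletableFuture<Void> delay(ScheduledExecutorService sch, long ms) {
        CompletableFuture<Void> cf = new CompletableFuture<>();
        sch.schedule(() -> cf.complete(null), ms, TimeUnit.MILLISECONDS);
        return cf;
    }

    public static void main(String[] args) {
        ScheduledExecutorService sch = Executors.newSingleThreadScheduledExecutor();
        long start = System.nanoTime();
        delay(sch, 100)
            .thenRun(() -> System.out.println(
                "resumed after ~" + (System.nanoTime() - start) / 1_000_000 + " ms"))
            .join(); // block here only so the demo can exit cleanly
        sch.shutdown();
    }
}
```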
Since JDK8: https://docs.oracle.com/javase/8/docs/api/java/util/concurre...
Clojure Concurrency - Rich Hickey https://www.youtube.com/watch?v=dGVqrGmwOAw
Even though the talk is called "Clojure Concurrency", the first half is about the problems Clojure solves in traditional concurrency.
One of my favorite talks I ever went to.
A lot of developers are not aware of which thread is going to execute their code, or of what that implies (I think it takes practice; at least it did for me). In my experience it often leads to shared mutable state without proper guards, or to deadlock hell from locks being created all over the place in the hope of making things safe, or to other nightmares.
>I know about Java masterpieces like the LMAX Disruptor that are mostly beyond my skill level
Both the basic idea of the Disruptor and its simplest implementation (single publisher, single subscriber) are pretty simple: use minimal memory barriers to write and read data cycling around an array, and (busy-)wait whenever you bump into whoever is ahead of you (the publisher, if you're the subscriber; the subscriber, if you're the publisher).
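The single-publisher/single-subscriber scheme described above can be sketched with two sequence counters and spin-waits; this is our toy version in the spirit of the Disruptor, not LMAX's code:

```java
import java.util.concurrent.atomic.AtomicLong;

// Single-producer/single-consumer ring buffer: the volatile writes to head
// and tail are the only barriers; each side busy-waits on the other.
class SpscRing {
    private final long[] buffer;
    private final int mask;
    private final AtomicLong head = new AtomicLong(0); // next slot to write
    private final AtomicLong tail = new AtomicLong(0); // next slot to read

    SpscRing(int sizePow2) { buffer = new long[sizePow2]; mask = sizePow2 - 1; }

    void publish(long value) {
        long h = head.get();
        while (h - tail.get() == buffer.length) Thread.onSpinWait(); // full
        buffer[(int) (h & mask)] = value;
        head.set(h + 1); // volatile write publishes the element to the consumer
    }

    long consume() {
        long t = tail.get();
        while (head.get() == t) Thread.onSpinWait(); // empty
        long v = buffer[(int) (t & mask)];
        tail.set(t + 1); // volatile write frees the slot for the producer
        return v;
    }
}

public class RingDemo {
    public static void main(String[] args) throws InterruptedException {
        SpscRing ring = new SpscRing(8);
        Thread producer = new Thread(() -> {
            for (long i = 1; i <= 100; i++) ring.publish(i);
        });
        producer.start();
        long sum = 0;
        for (int i = 0; i < 100; i++) sum += ring.consume();
        producer.join();
        System.out.println(sum); // 1 + 2 + ... + 100 = 5050
    }
}
```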
Quoting one of its authors:
« Sometimes we have absolutely no choice and we need to go parallel and use a lot of concurrency. If you do, get people in who are good at it. And actually, I found most of the people who are really good at it, their instinct is they'll do it as an absolute last resort, because they know how complicated it actually gets. There is a scottish comedian called Billy Connolly [who said]: "people who want to own a gun, or be a politician, should be automatically barred from either of them." And I think it's the same with concurrency: anybody who just wants to do it should not be allowed. » (https://www.infoq.com/presentations/top-10-performance-myths)
Take a look at the dated but still relevant book by Brian Goetz, Java Concurrency in Practice; many problems are illustrated with a code section.
Yes and no. Yes, in the sense that pretty equivalent things can be done in different languages. No: I'm in the same group of "geniuses" as the GP, and on our current huge C++ platform project I see highly technical people struggle with concurrency/multithreading and do the wrong things in ways I don't remember even mildly technical people doing on various large Java projects.
Without more information this is the likely scenario, going by my own experience.
BTW, if it turns out you are a concurrent programming genius please write about it, eh? (Like a blog or book or something.)
The article states a benchmark of a 5000% speedup on floats when switching from BEAM to the JVM. I would like to offer $100 as a gift incentive to anyone here who wants to work on optimizing BEAM math.
But why can't it be both? Why can't you do everything that BEAM does... and then also have an optimising JIT for the straight line maths code? Couldn't you leave all the other parts of the system the same and keep all the existing benefits? Improving one doesn't damage the other does it?
The problem with number crunching is that it is very difficult to cut the whole computation into smaller units and schedule it pre-emptively. When that is possible for a specific use case, it is moderately easy to replace that part with NIFs. For efficient math you also need to convert the internal tagged number representation to the machine-native one, which is itself expensive. Solving these two things in the generic case, while preserving all the good parts, is very difficult.
Perhaps there are some easy wins, but a JIT is not an easy thing. Depending on your needs, pushing math onto a port or a NIF is probably a quicker win than trying to make it fast in Erlang. However, I wonder if the single static assignment optimizer would offer a path toward recognizing 'straight-line math code' and running it much faster. But there's still the issue of a potential mismatch between the very general number format, with automatic bignum promotion, and whatever the underlying machine provides.
I love this attitude. BEAM is something novel and special, and I think it's important to think of how to incrementally address its current shortcomings instead of throwing our hands up. I find GHC is another place where incrementalism on top of novelty is fulfilling a lot of people's wishlists.
Make it $10M and we can make it work for one version of OTP in a few years.

Make it $100M, recurring for 25 years, and we can make it stick in the ecosystem. This problem is hard, and a lot of people have tried over the years. It always breaks down with someone not being able to deliver or no one wanting to maintain it.
I'm trying to learn Elixir, and being a systems thinker, before I can get too comfortable I'm going to want to dive into the origin stories to build up a holistic map of why things are the way they are and what can and can't be done. Understanding bottlenecks in the BEAM seems like it will have to be part of that (the way I studied JVM technical documentation when I did performance and architecture work in Java).
From my experience, the gotchas tend to hit with emergent behavior, which is hard to benchmark: it may be repeatable in production but is hard to model in a testing framework.
I'm not sure how much impact off-heap messaging has had, but the basic gotcha is that as a process gets bigger, it tends to run slower (because GC over more memory takes longer) and to develop a larger message queue, which makes it slower still. You need backpressure in your system, or small blips in processing can blow up into huge message queues that can't be processed. Monitoring overall queue size and maximum queue size is an important health indicator.
The other basic gotcha is that Erlang/OTP tends to default to 'unlimited' resource limits and 'infinity' timeouts. You often want limits and timeouts, but a general system doesn't know what you want. Sometimes the unlimited settings result in terrible system behavior if you hit larger numbers than anyone else tested, but when you hit this, it's usually easy to fix.
A good thing about OTP is that they've written as much as possible of the environment in Erlang itself, so it's easier to change things when needed than in a system where most of the provided APIs are implemented in C.
The BEAM Book [1] is a good, though unfinished resource talking in general about the implementation - the memory model and the interpreter.
If you're interested in some very low-level details of the runtime, the internal documentation [2] also holds a lot of interesting details.
There are also some additional details on internals at Spawned Shelter [3].
[1]: https://blog.stenmans.org/theBeamBook/ [2]: https://github.com/erlang/otp/tree/master/erts/emulator/inte... [3]: http://spawnedshelter.com/#erlang-design-choices-and-beam-in...
I get mostly false positives trying to find those sorts of discussions or metrics.
However, that's not quite the same thing. The JVM allows you to change instructions, but not data. That is, if between versions you change what data a class contains, there is no way to migrate it in the running instance. The JVM either has one version of the bytecode loaded or the other; it has no concept of transitioning between them.
The BEAM has a mechanism for this. It can have both versions loaded, and you can write transformation functions that migrate the internal process state from one to the other.
Per the article, "Hot code loading means that the application logic can be updated by changing the runnable code in the system whilst *retaining the internal process state*" (emphasis added). That's the key bit for maintaining uptime during an upgrade. Honestly, I don't think it's used that often, but it's there.
I want to make sure you're talking about the same thing.
To be clear: JVM enables the feature, so “technically” JVM allows hot code reload. Not sure how useful this is in practice for non-Clojure JVM users.
[1] https://docs.oracle.com/javase/7/docs/api/java/lang/ClassLoa...
There are limits to how much you can change in already-loaded class code, but if you are just loading up new, dynamically generated code, you can do pretty much anything.
It's a big reason why some of these Java frameworks are so fast. They can generate highly optimized code on the fly, load it, and have it running alongside the existing app code within a few hundred milliseconds. And the Java JIT will optimize it as if the code had been there the whole time.
This makes performance optimizations easy that would be impossible in AOT-compiled languages like Go, C, C++, or Rust.
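A minimal sketch of generate-compile-load at runtime using only the JDK's own tooling (the generated class and its method are placeholders; this needs a JDK, not a bare JRE, so getSystemJavaCompiler() is non-null):

```java
import javax.tools.ToolProvider;
import java.io.File;
import java.io.FileWriter;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;

public class HotLoad {
    public static void main(String[] args) throws Exception {
        // 1. Generate source code on the fly.
        File dir = Files.createTempDirectory("gen").toFile();
        File src = new File(dir, "Generated.java");
        try (FileWriter w = new FileWriter(src)) {
            w.write("public class Generated { public static int answer() { return 42; } }");
        }

        // 2. Compile it with the in-process compiler (output lands next to the source).
        int rc = ToolProvider.getSystemJavaCompiler().run(null, null, null, src.getPath());
        if (rc != 0) throw new IllegalStateException("compile failed");

        // 3. Load it alongside the running application and call it reflectively.
        try (URLClassLoader cl = new URLClassLoader(new URL[]{dir.toURI().toURL()})) {
            Class<?> c = cl.loadClass("Generated");
            System.out.println(c.getMethod("answer").invoke(null)); // prints 42
        }
    }
}
```

Frameworks typically skip the reflection step by generating code that implements a known interface, so the JIT can inline calls into it.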
As I understand it, it is feature-complete and actually runs Erlang pretty well. It could be interesting to see some benchmark testing.