"[...] and I really dislike that Elixir tries to hide immutability. That does make it slightly easier for beginners, but it’s a leaky abstraction. The immutability eventually bleeds through and then you have to think about it."
I don't think it necessarily tries to hide it (at all), but it does have some instances where something feels like a mutable structure. Those can be, at least for me, a bit confusing to reason about if you're expecting things to both be and look immutable.
I suppose now that I know exactly what's weird, I should just go dig through the code and figure it out. Problem solved?
... One other thing, because I see this in the comments already, is that BEAM isn't the tool for every job -- but for some jobs, it is the only tool that does them well. Is the JVM faster at general tasks? Hell yes, but that's not the point; it's not even why BEAM is around.
It's about:
* Small concurrent workloads. Really long-running, CPU-intensive tasks aren't going to do well.
* Low latency. Not just low, but with a very, very small standard deviation. Your application's performance will be consistent.
* Fault tolerance.
The list goes on, and here's a nice summary of it (both bad and good):
http://blog.troutwine.us/2013/07/10/choose_erlang.html
There are times when I choose the JVM, and there are times when I choose BEAM or MRI. I just try to choose the right tool for the job, but some tools make some jobs very difficult.
cough ruby cough concurrency cough
Edit: One thing for people not familiar with BEAM, a "process" is not a Unix process, from the Elixir documentation:
"Processes in Elixir are extremely lightweight in terms of memory and CPU (unlike threads in many other programming languages). Because of this, it is not uncommon to have tens or even hundreds of thousands of processes running simultaneously."
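To make the quote concrete, here's a minimal Elixir sketch (mine, not from the docs): each process starts with a heap of only a couple of kilobytes, so spawning a hundred thousand of them in one VM is routine.

```elixir
# Spawn 100_000 processes; each blocks waiting for a :ping message.
# This completes quickly and uses only a modest amount of memory.
parent = self()

pids =
  for i <- 1..100_000 do
    spawn(fn ->
      receive do
        :ping -> send(parent, {:pong, i})
      end
    end)
  end

# Every one of them is alive and individually addressable.
send(hd(pids), :ping)

answer =
  receive do
    {:pong, 1} -> :ok
  after
    1_000 -> :timeout
  end
```

Try doing that with OS threads and watch your machine fall over.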
Being able to open a remote console and do system introspection/tracing/profiling/debugging is a huge advantage when running in production. And all languages running on top of BEAM, of course, get this for free.
In my experience, running JVM in production with tools like JProfiler/VisualVM/jconsole, etc. does not come close to the BEAM when trying to understand what is happening in the system.
Then you haven't tried Java Flight Recorder/Mission Control or the new javosize. BEAM doesn't come close... :)
Amen! Been doing that to the extent possible for a while and it is terrific!
So, it's not far-fetched. It will likely be a series of DSLs like the above, or iMatix's model-driven development approach. These would specify the system at a high level with precise requirements and constraints. Then, planning software with heuristics would produce the code, with similar systems for integration. Several people's worth of work, or 10-20 tools, becomes one person with one set of tools. I doubt we'll replace the person or the need for some programming tools.
http://www.slideshare.net/BrianTroutwine1/erlang-lfe-elixir-...
> In many systems, Java included, the Garbage Collector (GC) must examine the entire heap in order to collect all the garbage. There are optimizations to this, like using Generations in a Generational GC, but those optimizations are still just optimizations for walking the entire heap. BEAM takes a different approach, leveraging the actor model on which it is based: If a process hasn’t been run, it doesn’t need to be collected. If a process has run, but ended before the next GC run, it doesn’t need to be collected
Well, how does BEAM know which process ran (so that its garbage should be collected)? Bookkeeping, of course, and that is also "just an optimization". Similarly, if a JVM object hasn't been touched since the last collection -- it doesn't need to be examined.
> If, in the end, the process does need to be collected, only that single process needs to be stopped while collection occurs
And new HotSpot GCs rarely stop threads at all for more than a few milliseconds (well, depending on the generation; it's complicated), collecting garbage concurrently with the running application, and other JVMs have GCs that never stop any thread for more than 20us (that's microseconds) or so.
While BEAM's design helps it achieve good(ish) results while staying simple, the fact is that the effort that's gone into HotSpot gets it better results for even more general programs (collecting concurrent, shared data structures -- like ETS -- too).
I've said it before and I'll say it again: Erlang is a brilliant, top notch language, which deserves a top-notch VM, and the resources Erlang/BEAM currently have behind them are far too few for such a great language. Erlang's place is on the JVM. JVMs are used for many, many more soft-realtime (and hard-realtime) systems than BEAM, and yield much better performance.
An implementation of Erlang on the JVM (Erjang) done mostly by one person, was able to beat Erlang on BEAM in quite a few benchmarks, and that was without the new GCs, the new (or much improved) work-stealing scheduler and the new groundbreaking JIT (which works extremely well for dynamically-typed languages[1]).
OpenJDK could free Erlang programs from having to write performance-sensitive code in C (so many Erlang projects are actually mixed Erlang-C projects). While Erlang can be very proud of how much it's been able to achieve with so little, instead of fighting the JVM (or, rather, JVMs), it should embrace it. Everyone would benefit.
[1]: https://twitter.com/chrisgseaton/status/586527623163023362 , https://twitter.com/chrisgseaton/status/619885182104043520
Programming Erlang (authored by the creator of Erlang) says without any qualification at all that "Concurrent programs are made from small independent processes. Because of this, we can easily scale the system by increasing the number of processes and adding more CPUs."
When I read that I was expecting it to be followed by "ha ha... not really, because of algorithmic sequential dependencies and Amdahl's Law, of course!" but it isn't!
You can have an infinite number of processes but if the dataflow graph they form doesn't have any parallelism then Erlang and BEAM aren't likely to be able to work any magic to make them so. Even if it did have parallelism it is only going to have so much and you certainly won't be able to arbitrarily scale it beyond that by increasing the number of processes.
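To put a number on that, Amdahl's law gives the ceiling directly: if a fraction p of the work is parallelizable, the best possible speedup on n cores is 1 / ((1 - p) + p/n). A quick Elixir calculation (my own illustration):

```elixir
# Amdahl's law: the sequential fraction caps the speedup, no matter
# how many processes or cores you throw at the problem.
speedup = fn p, n -> 1.0 / ((1.0 - p) + p / n) end

on_8_cores = speedup.(0.95, 8)          # ~5.9x, not 8x
ceiling    = speedup.(0.95, 1_000_000)  # ~20x: 1 / (1 - 0.95), the hard limit
```

With 95% of the work parallelizable, a million processes still can't beat 20x. Spawning more processes only helps if the dataflow graph actually has the parallelism.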
What's more, the typical advice about mutable shared state in Erlang is to encapsulate it safely in a process -- which seems to me a recipe for further serialisation, and so a crazy thing to promote!
Erlang's goal is to take problems that are embarrassingly parallel in theory and make them embarrassingly parallel in practice. Serving a billion independent http requests in a distributed, parallel manner can technically be done in Java or C or assembly. But, it's very hard to do well and very easy to screw up in painful, confusing, life-wasting ways. Erlang makes it much easier to do well and much harder to screw up.
Chapter 26 of Programming Erlang, 2nd Edition, Programming Multicore CPUs, quite explicitly notes the problem of avoiding sequential bottlenecks, and even devotes an entire exercise to parallelizing a sequential program.
The point is scaling. Think in terms of request rate. If you know you can have millions of processes per machine and they run well in parallel, then you can handle requests with processes and stop worrying.
I just want to clarify: it's concurrent, not parallel. Erlang doesn't promise parallelism. You can get parallelism from concurrency, but not the other way around; Erlang only enables concurrency, and you may get parallelism because of it.
Data is immutable, so we don't have to worry about keeping data coherent between... anything. Whether it is two processes or two nodes. New data can be constructed with reference to old data without fear that the old data will be modified. So, "mutation" is really just new data with a reference to the old, unchanged data. This greatly lowers the churn in creating new data. It also means everything can just pass (process to process or node to node) what it has without fear it will be out of date.
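A tiny Elixir example of what that looks like in practice (my sketch): "updating" returns new data, the original is untouched, and unchanged parts are shared rather than copied.

```elixir
# "Updating" a map returns a new map; the old one is unchanged,
# and the parts that didn't change are shared between the two.
old = %{name: "alice", roles: [:admin]}
new = Map.put(old, :name, "bob")

# old.name  => "alice", still
# new.name  => "bob"

# Lists share structure the same way: prepending reuses the old
# list as the new list's tail, no copying required.
xs = [2, 3]
ys = [1 | xs]
```

This sharing is why immutable "copies" are cheap, and why a message can be handed to another process with no fear it'll change underneath anyone.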
Everything is defined in modules. Modules define what we would think of in OOP as namespaces, structures, classes/types, and class functions. Importantly, they only define functionality; modules do not have state. Therefore, functions accept some set of inputs, create new data from the inputs (no mutation), and return some output. This makes it very easy to reason about what the code is doing if you keep the modules well defined and reasonably sized. This code can be shared around easily, too: it's got no state and is immutable.
Processes are an abstraction. You can think of them as threads, but they're really just a stack and a little bookkeeping. A BEAM VM will normally run a number of real threads equal to the number of CPUs in the machine. Each real thread will exclusively pick a process, load the bookkeeping, point itself at the stack, and execute bytecode for a period of time. When done, it records the changes in the bookkeeping and moves to the next process. This is very lightweight, so literally millions can run on a single computer. Because they are self-contained, they're easy to clean up. Processes also expose a standard interface for communication: message passing via per-process mailboxes. Again, immutable messages are sent back and forth, so it doesn't matter whether it's the same node or not.
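The communication part in Elixir terms (a minimal sketch of my own): `send` drops an immutable message in the other process's mailbox, `receive` pulls one out, and that's the entire interaction surface between processes.

```elixir
# Two processes interacting only via messages: no shared mutable state.
parent = self()

pid =
  spawn(fn ->
    receive do
      {:double, n, from} -> send(from, {:result, n * 2})
    end
  end)

send(pid, {:double, 21, parent})

result =
  receive do
    {:result, v} -> v
  after
    1_000 -> :timeout
  end
```

Because the message is conceptually copied into the receiver's mailbox, the same code works unchanged whether `pid` lives on this node or another one.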
Finally, everything is abstracted to the notion of nodes within a cluster. By default, anything you execute runs on the local node, but you can specify otherwise. I can execute a module call on another machine or spawn a new process on another machine. It just means a little more information in the call, but it's the exact same concept programmatically. Also, it's possible to group processes into named services: you can call a named service and it will know which processes to contact. It's a very low barrier to entry to parallelize your code if you just write it that way.
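The "a little more information in the call" is literally just a node name. A sketch using the local node (since a real cluster needs distribution started, e.g. with `--sname`; swapping in `:"other@host"` is the only change):

```elixir
# Node.spawn/2 takes a node name; here we target our own node.
# Before distribution is started, Node.self() is :nonode@nohost,
# and spawning "on" it just spawns locally -- same call shape either way.
target = Node.self()

parent = self()
_pid = Node.spawn(target, fn -> send(parent, {:hello, Node.self()}) end)

from =
  receive do
    {:hello, node_name} -> node_name
  after
    1_000 -> :timeout
  end
```

Module calls work the same way via `:rpc.call(node, module, function, args)`: identical semantics, one extra argument.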
When you start thinking in terms of how to structure your code for BEAM, you inherently get easy access to scalability.
The ease of scaling across machines, fault tolerance, and low latency variation are the more typical selling points. Besides that, god forbid Erlang become just another JVM language; I embrace competition.
What does that have to do with the VM implementation?
> fault tolerance
True, that is a good selling point -- in theory. Indeed, BEAM's process isolation is better than the JVM's on paper. In practice, so many Erlang systems have so much C in them (because Erlang isn't fast enough for the data plane), that they can still bring down the entire VM (not as if there aren't other ways of doing that even without native code), or they interfere with one another in other ways because of BEAM's poor support for shared concurrent data structures.
> low latency variation
Nothing that can't be achieved on the JVM. Much of the low-latency Erlang enjoys is because relatively little data is kept on the Erlang heap anyway, and whatever significant amount of data is kept on the Erlang heap, it's in non-GCed ETS. If that's your way of achieving low latency variation, Erlang can do better on HotSpot.
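For readers unfamiliar with ETS: it's an in-memory term store that lives outside any process's GC-managed heap, which is exactly why big working sets in ETS don't add to per-process collection work. A minimal Elixir sketch (mine):

```elixir
# An ETS table is owned by a process but its data is stored off-heap,
# so it is not traversed when that process's heap is collected.
table = :ets.new(:cache, [:set, :public])

:ets.insert(table, {:user_1, %{name: "alice"}})
:ets.insert(table, {:user_1, %{name: "bob"}})  # :set keys are unique; overwrites

[{:user_1, user}] = :ets.lookup(table, :user_1)
# user.name => "bob"
```

Reads and writes do copy terms between the table and the process heap, which is the trade-off for keeping the data out of the GC's way.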
> Besides that, god prevent erlang to become just-another-JVM-language, I embrace competition.
If your goal is not to have the best language environment you can but to show the world you have impressive results for the effort you've put in, then that's a whole other discussion.
And if all you want is competition, you can have Erlang on BEAM and the JVM. Why tie the language to one VM? Many JVM languages compile to JavaScript, too (Clojure, Kotlin, Scala, Fantom, and probably more).
Truth be told, most of the crowd using BEAM doesn't care if it's a bit slower than Java. They just want easy scaling, distribution, and fault-tolerance. A different code-base than Java's is a plus in terms of increasing implementation diversity and avoiding the bullseye currently on Java.
That bullseye exists only in the minds of some HNers. Here is a very (very!) partial list of companies running primarily or largely on the JVM: Google, Twitter, Netflix, LinkedIn, Box, IBM, SAP, Amazon, eBay.
> They just want easy scaling, distribution, and fault-tolerance.
... So they write chunks of their code in C. That would be completely unnecessary if they'd just run Erlang on the JVM.
I'm just learning Elixir (and therefore erlang/BEAM somewhat) and one thing that's cool to me is that a piece of code that's taking too long to execute can be paused by the VM while it switches to another thing, which keeps the latency down. I think, like, each process has some number of "ticks" or something before it switches away.
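(The "ticks" are called reductions: each process gets a budget of a few thousand per time slice and is swapped out when it's spent, even mid-computation. A rough demonstration of my own, pinning the VM to one scheduler so a busy loop and the main process genuinely compete for the same core:)

```elixir
# Preemption demo: a pure busy loop cannot starve other processes,
# because the scheduler deducts reductions and forcibly swaps it out.
old = :erlang.system_flag(:schedulers_online, 1)  # everything on one core

loop = fn loop -> loop.(loop) end
busy = spawn(fn -> loop.(loop) end)               # spins forever, never yields

# If `busy` could hog the core, this timer message would never be handled.
Process.send_after(self(), :still_alive, 100)

got =
  receive do
    :still_alive -> true
  after
    2_000 -> false
  end

Process.exit(busy, :kill)
:erlang.system_flag(:schedulers_online, old)      # restore scheduler count
```

You can also peek at a process's spent budget with `Process.info(pid, :reductions)`.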
Can erlang on the JVM do that?
Edit: Also, the other thing that majorly attracts me to Elixir/Erlang is OTP (applications, genservers, supervision trees with restart strategies, etc). Are there any plans to port those libraries/philosophy into Quasar?
> Can erlang on the JVM do that?
Of course it can! Just like BEAM does it. (In fact, Quasar used to do that, too. We took out that feature because Quasar also gives you access to kernel threads, and processes that take too long can just be moved to kernel threads, which do this kind of preemption better anyway. But an Erlang implementation on the JVM can behave just as Erlang does on BEAM.)
$8000 per machine, though.
The G1 collector that will be made default in Java 9 might make some applications effectively pauseless too on some workloads.