> the ability to spin up processes so lightweight that they make even Java threads look heavy in comparison - is the key to its success when it comes to parallel and distributed computing.
Except you can do the same in Java -- see Quasar or Erjang. The JVM is so powerful that true lightweight threads -- just like Erlang's -- can be added as a library.
> I've found Erlang applications with BEAM to perform far better in that domain.
I've found the exact opposite. Java with Quasar fibers handily beats Erlang code. The more actual work being done, the bigger the difference. BEAM is excellent at scheduling, but pretty terrible at running user code; it's notoriously slow, which is why all important Erlang library functions are implemented in C, and why heavyweight Erlang shops do a lot of C coding. When performance is important, doing Erlang often means doing Erlang and C.
> whereas the JVM, .NET CLR, etc. were originally designed for imperative languages
So was your machine, yet BEAM runs on it just fine. You can like the imperative style or not, but it's more general than the functional one when it comes to implementations on real hardware. BEAM runs on an imperative, shared-state machine, and creates a nice abstraction. You can do the same on the JVM without loss of generality, and, as it turns out, with a nice boost to performance (because HotSpot's JIT and GCs are state-of-the-art). The only advantage BEAM has over the JVM (or at least HotSpot) is a better level of isolation between processes (i.e. it's a bit harder for one process to impact the performance of another, because they have sort-of separate heaps). But again, BEAM is a fine, beautiful runtime that is perfectly suitable if you need good concurrency but aren't worried about processing speed.
> Maybe because the equivalent Haskell codebase to a "big" project in most non-declarative languages ends up being "small" in comparison? Or are you talking about importance?
Well, both. Haskell has never been used to write a large ERP, an airport management system, manufacturing automation, an air-traffic control system, device drivers, an OS, a database, a banking system or a large social network -- take your pick. It may have been used to a small extent for some specific projects in banking, but that's about it. The only large, complex project that is "interesting" from a cost-estimation perspective ever written in Haskell is its own compiler (and possibly other compilers). Now, I may have missed one or two specific projects, but their rarity only demonstrates the problem. Go, a much younger language, is already more battle-tested than Haskell, and even Go is far from being truly battle-tested, so this says a lot.