Yes! I'm imagining a value proposition where APL is as fast as C or Fortran, on distributed systems, because at that point it would probably pull away most of the users of Matlab, Mathematica, and probably TensorFlow.
Can you imagine? APL is an amazingly powerful language. It's easy to write nearly any mathematical function in it. What if it were faster than every other mathematical language? It would probably dominate the market overnight.
You mean Dyalog APL? It already regularly outperforms normal, handwritten C code. It also supports distributed computing and multi-threading, and the Co-dfns compiler can be used to compile your APL code for the GPU. For example, consider the following talk, which discusses sub-nanosecond lookups/search using the Dyalog APL interpreter.
I don't think it will ever dominate the market - some things such as FFI remain weak points compared to C.
Plus a lot of what makes Mathematica good is that they have specific algorithms that they've optimized quite a bit, for things like computer algebra &c.
As mruts pointed out above, kdb+/q provides what you're imagining - it provides fantastic distributed computing support via its IPC protocol and also has multi-core support.
There is a free 64-bit version available with limits that don't seem harsh at all:
The 64-bit kdb+ On-Demand Personal Edition is free for personal, non-commercial use. Currently it may be used on up to 2 computers, and up to a maximum of 16 cores per computer, but is not licensed for use on any cloud – only personal computers. It requires an always-on internet connection to operate, and a license key file, obtainable from ondemand.kx.com
You can start the 32-bit instance with the -s switch (which starts q with multiple slaves) and then run a function over data in parallel using the "peach" command [0].
But yes, I do agree that the 32-bit instances have limited use, as they can't be used in for-profit projects.
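For anyone who hasn't used q, here is a rough Python sketch of what a "parallel each" like peach does conceptually: apply one function to every item of a list across a fixed number of workers and return the results in input order. This is only an analogy, not q code. The `peach` and `square` names below are hypothetical helpers, and the `workers` parameter is an assumption standing in for the count you would pass to -s.

```python
from concurrent.futures import ThreadPoolExecutor


def peach(func, items, workers=4):
    """Apply func to each item on a pool of worker threads.

    Hypothetical helper imitating the idea of a parallel each;
    `workers` plays the role of the count given to q's -s switch.
    Results come back in the same order as the inputs.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map preserves input order even if tasks finish out of order.
        return list(pool.map(func, items))


def square(x):
    """Toy per-item function used for the demonstration."""
    return x * x


if __name__ == "__main__":
    print(peach(square, range(10), workers=4))
```

The sketch only shows the semantics (same function, many items, ordered results). It says nothing about kdb+'s actual performance, and Python threads will not speed up CPU-bound work the way q's workers can.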