Dyalog APL/S-64 (17)
⍴+.×⌿?2 3000 3000⍴1e10
- size 3000: 6.6s, 0.5GiB RSS
- size 10000: 101s, 5.4GiB RSS
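
As a rough sanity check on the two timings above, the sketch below compares the measured slowdown with what a naive cubic-cost model would predict. The O(n^3) model is an assumption made here for illustration, not something measured or claimed in these notes, and the helper name `scaling_exponent` is hypothetical.

```python
import math

# Timings copied from the Dyalog APL results above (matrix size -> seconds).
timings = {3000: 6.6, 10000: 101.0}

def scaling_exponent(n1, t1, n2, t2):
    """Return k such that t2 / t1 == (n2 / n1) ** k."""
    return math.log(t2 / t1) / math.log(n2 / n1)

(n1, t1), (n2, t2) = sorted(timings.items())

observed_ratio = t2 / t1       # measured slowdown from size 3000 to size 10000
cubic_ratio = (n2 / n1) ** 3   # slowdown predicted by the assumed O(n^3) model
k = scaling_exponent(n1, t1, n2, t2)

print(f"observed slowdown: {observed_ratio:.1f}x")
print(f"cubic prediction:  {cubic_ratio:.1f}x")
print(f"fitted exponent:   {k:.2f}")
```

With only two data points the fitted exponent is a crude estimate, so it is best read as a consistency check on the numbers rather than a statement about the algorithm Dyalog uses.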
Conclusions: J's implementation is still the fastest at this particular task (multiplying huge matrices on a single CPU thread; granted, not the most significant of benchmarks). Dyalog APL comes close behind. GNU APL and NumPy lag much further behind.