For the sake of completeness, there are also great packages like OpenMPI and OpenMP.
At least for my particular applications, though, each of them has at least one of these drawbacks: 1) a steep learning curve for programmers, 2) language support issues, or 3) a design geared toward batch processing.
Personally, I find shared memory interfaces the easiest to program against when the data access patterns are complicated, though that might just be personal preference.
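
To make the point about complicated access patterns a bit more concrete, here is a minimal sketch in Python. The `values` and `neighbors` data and the function names are made up for illustration. Each worker thread reads whatever entries it needs directly from the shared lists, so nothing has to be partitioned or exchanged by hand. Note that CPython threads won't give real CPU parallelism for this kind of work because of the GIL; the sketch only shows the programming model, not a performance claim.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: a tiny graph whose edges jump around unpredictably.
values = [10, 20, 30, 40, 50, 60]
neighbors = [[5, 2], [0], [4, 1, 3], [3], [0, 5], [1]]


def neighbor_sum(node):
    # Each worker reads whichever entries it needs straight from the shared
    # lists; there is no partitioning step and no explicit message exchange.
    return sum(values[n] for n in neighbors[node])


def all_neighbor_sums(workers=3):
    # Threads in one process share the same memory, so `values` and
    # `neighbors` are visible to every worker without copying.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(neighbor_sum, range(len(values))))


if __name__ == "__main__":
    print(all_neighbor_sums())
```

With a message-passing design, the same computation would first need a decision about which process owns which entries, plus code to request the entries that live elsewhere. That extra bookkeeping is what I mean by the access pattern getting in the way.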