Also GNU utils
http://aadrake.com/command-line-tools-can-be-235x-faster-tha...
The experiment above (which has an interesting GitHub repo) is somewhat over the top (and unusable in the real world), but it is still eye-opening. Hadoop and Spark bring so much complexity that simpler solutions are worth considering.
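
To give a rough idea of what "simpler" can look like, here is a minimal single-machine sketch in Python. It assumes PGN-style chess data with `[Result "..."]` lines, which is the kind of input the linked post works with; the function name `count_results` is made up for illustration and is not taken from the post or its repo.

```python
from collections import Counter


def count_results(lines):
    """Tally game outcomes from PGN-style '[Result "..."]' lines.

    `lines` can be any iterable of strings (an open file, sys.stdin, a list),
    so the input is streamed and never has to fit in memory at once.
    """
    counts = Counter()
    for line in lines:
        if line.startswith("[Result "):
            # The outcome is the quoted value, e.g. 1-0, 0-1 or 1/2-1/2.
            parts = line.split('"')
            if len(parts) >= 3:
                counts[parts[1]] += 1
    return counts
```

Calling `count_results(sys.stdin)` or `count_results(open(path))` (with a hypothetical `path`) processes one line at a time, so there is no cluster to set up and memory use stays flat. Whether that is fast enough obviously depends on the data size.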