Interesting. Do you know which version of Splunk you were using? Our latest version vastly improves query performance over large datasets.
Re. an open source log indexer: Agreed. This is a space that will eventually be dominated by open source tools, used particularly by startups and small businesses. I think most people ignore this use of a MapReduce-like framework because they understand conceptually how it could be used, but 99% of the work is in the implementation, not the idea. And as yet, I don't believe there has been a specific implementation beyond what companies like Shopify are doing, where they add nice GUI tools on top of awk and grep (which admittedly is probably good enough for most people and businesses on this forum).