Wikidata is several orders of magnitude smaller than Freebase (closed by Google in May), and it won't fit in your laptop's RAM.
That said, I believe the process described in the blog post does not load the whole Wikidata dump into memory, so it would work just as well for processing Freebase or even larger data dumps on your laptop.
From the post: How Akka Streams can be used to process the Wikidata dump in parallel and using constant memory with just your laptop.
[1] https://developers.google.com/freebase/data http://dumps.wikimedia.org/other/wikidata/
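To make the constant-memory point concrete, here is a minimal sketch in plain Python rather than Akka Streams. It assumes the dump is a gzip-compressed JSON array with one entity per line, each line ending in a comma; the names `iter_entities` and `count_by_type` are made up for illustration and are not from the post.

```python
import gzip
import json
from typing import Dict, Iterator


def iter_entities(source) -> Iterator[dict]:
    """Yield one decoded entity at a time from a gzip-compressed dump.

    `source` may be a file path or an open binary file object.
    Only the current line is held in memory, never the whole dump.
    """
    with gzip.open(source, mode="rt", encoding="utf-8") as lines:
        for line in lines:
            # Assumed layout: "[" on the first line, "]" on the last,
            # and one JSON object per line followed by a comma.
            line = line.strip().rstrip(",")
            if line in ("", "[", "]"):
                continue
            yield json.loads(line)


def count_by_type(source) -> Dict[str, int]:
    """Example of a streaming aggregation: count entities per "type" field."""
    counts: Dict[str, int] = {}
    for entity in iter_entities(source):
        kind = entity.get("type", "unknown")
        counts[kind] = counts.get(kind, 0) + 1
    return counts
```

Memory use here depends on the longest line and the size of the aggregate, not on the size of the dump, which is why the same approach should carry over to larger dumps.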
http://www.quora.com/Where-can-I-find-large-datasets-open-to...
[1] Using parallel bulk indexer for ES: https://github.com/miku/esbulk
https://github.com/andrewvc/wikiparse/tree/java
That being said, when it comes to indexing Wikipedia, Elasticsearch can already spread the indexing work across multiple threads internally. Multithreading the reading/parsing isn't a huge win. Doing decompression in a separate thread, however, is.
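A rough sketch of the "decompress in a separate thread" idea, in Python and not taken from wikiparse: a background thread decompresses the file and hands lines over through a bounded queue, so the consumer can parse while the next chunk is being inflated. The function name and the `max_pending` parameter are hypothetical.

```python
import gzip
import queue
import threading
from typing import Iterator

_DONE = object()  # sentinel marking the end of the stream


def lines_from_reader_thread(source, max_pending: int = 1000) -> Iterator[bytes]:
    """Decompress `source` in a background thread and yield its lines.

    The bounded queue keeps memory constant: the reader blocks once
    `max_pending` lines are waiting to be consumed.
    """
    pending: queue.Queue = queue.Queue(maxsize=max_pending)

    def reader() -> None:
        try:
            with gzip.open(source, mode="rb") as compressed:
                for line in compressed:
                    pending.put(line)
        finally:
            # Always signal the end, even if decompression failed,
            # so the consumer does not wait forever.
            pending.put(_DONE)

    # Daemon thread: if the consumer stops early, the blocked reader
    # does not keep the process alive.
    threading.Thread(target=reader, daemon=True).start()

    while True:
        item = pending.get()
        if item is _DONE:
            return
        yield item
```

In a real pipeline you would probably hand over chunks of lines rather than single lines to cut queue overhead, and surface reader errors to the consumer instead of just ending the stream.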
With Solr, it's similar:
> Sometimes you need to index a bunch of documents really, really fast. [...] The solution is two-fold: batching and multi-threading
From: http://lucidworks.com/blog/high-throughput-indexing-in-solr/
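The "batching and multi-threading" recipe can be sketched like this in Python. This is not the code from the Lucidworks post: `send_batch` is a hypothetical callable standing in for whatever actually ships a batch to the search server, and the default `batch_size` and `workers` values are arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def batched(docs: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Group a stream of documents into lists of at most `size` items."""
    it = iter(docs)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def index_all(
    docs: Iterable[dict],
    send_batch: Callable[[List[dict]], int],
    batch_size: int = 500,
    workers: int = 4,
) -> int:
    """Send batches concurrently and return the total number of documents sent.

    `send_batch` is assumed to return how many documents it indexed.
    Note: Executor.map submits every batch up front, so a very large
    input would need bounded submission to keep memory constant.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(send_batch, batched(docs, batch_size)))
```

Batching amortizes the per-request overhead, while the thread pool keeps several requests in flight so the server is not idle between batches.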
It's a bit disturbing to see an employee presenting her personal life, kids, interests, and whatnot. Good job, IntentHQ!
The video: https://www.intenthq.com/resources/interest-fingerprint/
I'm sure their products are of the highest quality, but their blog isn't a great advert in my opinion.
[1] http://engineering.intenthq.com/2015/06/for-those-about-to-c...
[2] http://engineering.intenthq.com/2015/06/wikidata-akka-stream...
Seems so to me after skimming the article, but maybe I missed an important advantage of using Akka Streams for this task?
I think the intent was for this to be more of a demonstrative example. With a more complex, evolving, real-world processing pipeline, Akka Streams could be really useful.