I'm did consider using clustering and having some master process coordinate everything, and using some shared-memory caching library. But it would not be "easy" to set up, especially compared to something like Java where you get thread pools and synchronized thread-safe collections out of the box.
And Lambda would have been totally impractical. As I said, I had hundred of gigs of data to process. If I'd been uploading this over my puny ADSL upstream every time, I'd still be waiting for a single run to complete.
I'm not trashing Node. I like it. There's a reason I used in the first place, after all. But for this particular use-case, I didn't find it was very good fit.