That said, your description of "read and synthesize a 10k+ smorgasbord" is nowhere near descriptive enough to answer your question or to give any kind of estimate of how easy or hard this work is. It could be anything from "I'll have it done in a couple of hours this afternoon" to a year-long effort by multiple developers.
The best I can say is that in the real world it's pretty rare to have a huge task with no sensible subtasks. In the case of ingesting a large mess of data like you're describing, those subtasks could be things like identifying and ingesting the data from system X, or processing event Y. These are steps the developer is going to go through anyway, and in many cases they are even deployable and useful on their own! Even when they aren't individually deployable, they serve as useful landmarks: they let you see how the work is progressing, readjust estimates as needed, and spot early whether the entire design needs to be reconsidered because an assumption was violated.
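To make that concrete, here is a minimal sketch of what "per-source subtasks" might look like in code. Everything here (the source names, the stub ingesters, `run_pipeline`) is hypothetical; the point is just that each subtask is a separate, independently testable unit, so progress and slipped estimates are visible subtask by subtask rather than as one opaque blob of work.

```python
def ingest_system_x():
    """Identify and ingest records from hypothetical system X (stub)."""
    return [{"source": "x", "id": 1}]

def ingest_system_y():
    """Process hypothetical event stream Y (stub)."""
    return [{"source": "y", "id": 2}]

# Each entry is one deployable/demonstrable milestone; the pipeline
# merely composes them, so you can ship or demo sources as they land.
SUBTASKS = {
    "system_x": ingest_system_x,
    "system_y": ingest_system_y,
}

def run_pipeline():
    results = {}
    for name, task in SUBTASKS.items():
        results[name] = task()
        print(f"finished subtask: {name} ({len(results[name])} records)")
    return results

if __name__ == "__main__":
    run_pipeline()
```

Structuring the work this way also means a violated assumption in one source's ingester is discovered when that subtask is attempted, not at the end of a monolithic effort.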