So, this issue isn't about sites that Google can't crawl in their entirety; it's about sites where Google discards pages it has already crawled. If a site has fewer than [large number] pages, there would be no need to worry: Google could simply index them all. And their indexing algorithm isn't operating naively either. For sites with a large number of pages, there is plenty of analysis they can do, such as checking whether a page contains coherent text, to determine whether its information is worth indexing.
In the case described here, though, these pages actually were indexed at one point; Google simply decided that once they reached a certain age, they were no longer worth remembering. They could just as easily have decided to keep them.