It's up to you if that counts as "a handful" or not.
Because that's the distinction being argued here: it's "a handful"[0] of probabilities, not the complete work.
[0] I'm not sold on the phrasing "a handful", but I don't care enough to argue terminology; the term "handful" feels like it's being used in a sorites paradox kind of way: https://en.wikipedia.org/wiki/Sorites_paradox
If we take math or computer science for example: some very important algorithms can be compressed to a few bits of information if you (or a model) have a thorough understanding of the surrounding theory to go with it. Would it not amount to IP infringement if a model regurgitates the relevant information from a patent application, even if it is represented by under a kilobyte of information?
I think this is all still compatible with saying that ingesting an entire book is still:
> If you're taking a handful of word probabilities from every book ever written, then the portion taken from each work is very, very low
(Though I wouldn't want to make a bet either way on "so courts aren't likely to care" that follows on from that quote: my not-legally-trained interpretation of the rules leads to me being confused about how traditional search engines aren't a copyright violation).
As my not-legally-trained interpretation of the rules leads to me being confused about how traditional search engines aren't a copyright violation, I don't trust my own beliefs about the law.