Memorization is storing data. Generalization is developing the heuristics by which you compress stored data. To distill knowledge is to apply those heuristics to lossily compress a large amount of data into a much smaller amount, from which you can nevertheless recover enough information to be useful in the future.
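
A toy sketch of the distinction, under loud assumptions: the "data" is 1000 noisy points on a line, and the "heuristic" is an ordinary least-squares fit. The helper names `memorize` and `generalize` are made up for illustration; nothing here is meant as the definitive way to measure either idea.

```python
import random

def memorize(points):
    # Memorization: keep every (x, y) pair verbatim.
    return dict(points)

def generalize(points):
    # Generalization (assumed here to be a least-squares line fit):
    # compress all the points down to two numbers, slope and intercept.
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var = sum((x - mean_x) ** 2 for x, _ in points)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

# Synthetic data: y = 3x + 1 plus Gaussian noise (fixed seed for repeatability).
rng = random.Random(0)
points = [(x, 3.0 * x + 1.0 + rng.gauss(0, 0.5)) for x in range(1000)]

table = memorize(points)    # 1000 stored entries
a, b = generalize(points)   # 2 stored numbers

# The compression is lossy: the line cannot reproduce each point's noise.
worst_error = max(abs(y - (a * x + b)) for x, y in points)

# The table only answers for inputs it has seen; the line also covers new ones.
unseen_x = 1500
print(unseen_x in table)    # False
print(a * unseen_x + b)     # an estimate near 3 * 1500 + 1
print(worst_error)          # nonzero: the per-point detail was discarded
```

The table recovers every stored point exactly but says nothing about `unseen_x`. The two fitted numbers lose the per-point noise yet still recover enough to give a useful answer for an input that was never stored.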