story
Say that the scanner internally splits the scan into regions of 10x10 pixels, which it stores in memory. If a later region differs from a stored one in fewer than (say) 10% of its pixels, the two regions are assumed to be identical and the stored one is reused in place of the later one. The regions have no semantic meaning: the scanner compares raw pixels and has no notion of what they depict.
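A minimal sketch of that substitution, under stated assumptions: the image is a bilevel grid of 0/1 pixels held as nested lists, and the names `substitute_regions`, `diff_fraction`, `BLOCK` and `THRESHOLD` are made up for illustration. This is not any real scanner's firmware, just the rule described above written out.

```python
from typing import List, Optional

Image = List[List[int]]  # rows of 0/1 pixels (assumed representation)

BLOCK = 10        # region side length, taken from the 10x10 example
THRESHOLD = 0.10  # regions differing in fewer than 10% of pixels count as identical


def get_block(img: Image, top: int, left: int, size: int) -> Image:
    """Copy the region whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in img[top:top + size]]


def diff_fraction(a: Image, b: Image) -> float:
    """Fraction of pixel positions at which two same-shaped regions differ."""
    total = sum(len(row) for row in a)
    differing = sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return differing / total


def find_match(stored: List[Image], block: Image, threshold: float) -> Optional[Image]:
    """Return the first stored region that is 'close enough' to block, if any."""
    for candidate in stored:
        same_shape = (len(candidate) == len(block)
                      and len(candidate[0]) == len(block[0]))
        if same_shape and diff_fraction(candidate, block) < threshold:
            return candidate
    return None


def substitute_regions(img: Image, size: int = BLOCK,
                       threshold: float = THRESHOLD) -> Image:
    """Replace each region by an earlier stored region when they nearly match."""
    out = [row[:] for row in img]
    stored: List[Image] = []  # regions kept in memory so far
    for top in range(0, len(img), size):
        for left in range(0, len(img[0]), size):
            block = get_block(img, top, left, size)
            match = find_match(stored, block, threshold)
            if match is None:
                stored.append(block)  # new pattern: remember it, leave pixels alone
                continue
            # Near-duplicate: overwrite with the stored region's pixels.
            for dy, row in enumerate(match):
                out[top + dy][left:left + len(row)] = row
    return out
```

Note that the comparison is purely pixel-based: two regions that show different things but happen to differ in only a few pixels are merged just the same, and the output then contains the first region in both places.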
OCR, by contrast, translates the scan into characters from a character set.