Right now I'm banking on the fact that lawyers copy and paste. That's a fair assumption, since all the major firms keep "clause libraries" of reusable text.
In the future I hope to tie in the NLP work we've been doing so we can also catch near matches (fuzzy matching), not just exact copies.
We're highlighting text we've seen before, so it should be easy to see which lines are actually unique text.
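
To make the exact-match idea concrete, here's a rough sketch of how the "seen before" check could work. This is not our actual code: the function names (`normalize`, `fingerprint`, `build_seen_index`, `mark_lines`) are made up for illustration, and I'm assuming we compare line by line after collapsing whitespace and ignoring case.

```python
import hashlib


def normalize(line: str) -> str:
    # Assumption: whitespace and case differences don't count as "new" text.
    return " ".join(line.split()).lower()


def fingerprint(line: str) -> str:
    # Hash the normalized line so the index stores fixed-size keys.
    return hashlib.sha256(normalize(line).encode("utf-8")).hexdigest()


def build_seen_index(documents: list[str]) -> set[str]:
    # Collect fingerprints for every non-blank line in earlier documents.
    seen = set()
    for doc in documents:
        for line in doc.splitlines():
            if line.strip():
                seen.add(fingerprint(line))
    return seen


def mark_lines(text: str, seen: set[str]) -> list[tuple[str, bool]]:
    # Pair each non-blank line with True if it was seen before (highlight it),
    # or False if it is unique to this document.
    return [
        (line, fingerprint(line) in seen)
        for line in text.splitlines()
        if line.strip()
    ]


if __name__ == "__main__":
    seen = build_seen_index([
        "The Supplier shall indemnify the Buyer.\n"
        "This Agreement is governed by the laws of England."
    ])
    new_doc = (
        "the supplier  shall indemnify the buyer.\n"
        "Delivery is due within 30 days."
    )
    for line, was_seen in mark_lines(new_doc, seen):
        print("SEEN  " if was_seen else "UNIQUE", line)
```

Exact matching like this only catches true copy and paste (plus whitespace or case changes). A reworded clause would show up as unique, which is the gap fuzzy matching would need to fill.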