That seems too granular: a single extra 'uh' or 'and' would make two otherwise identical sentences count as different.
It might work well if you first pair up the similar sentences from A and B, either by word-level edit distance or by mapping each sentence to a lower-dimensional space with a sentence embedding and comparing the vectors.
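To make the edit-distance option concrete, here is a rough sketch of what I mean. The function names (`word_edit_distance`, `pair_sentences`) and the `max_ratio` threshold are made up for illustration, and the greedy matching is just the simplest thing that could work, not necessarily the best pairing strategy.

```python
def word_edit_distance(a, b):
    """Levenshtein distance computed over word tokens instead of characters."""
    # Assumption: whitespace tokenization and lowercasing are good enough;
    # punctuation is not stripped here.
    x, y = a.lower().split(), b.lower().split()
    prev = list(range(len(y) + 1))  # distances from the empty prefix of x
    for i, wx in enumerate(x, start=1):
        curr = [i]
        for j, wy in enumerate(y, start=1):
            cost = 0 if wx == wy else 1
            curr.append(min(prev[j] + 1,          # delete wx
                            curr[j - 1] + 1,      # insert wy
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def pair_sentences(sents_a, sents_b, max_ratio=0.5):
    """Greedily pair each sentence in A with its closest unused sentence in B.

    max_ratio (hypothetical knob): a pair is accepted only if the distance
    divided by the longer sentence's word count is at most this value.
    """
    pairs, used = [], set()
    for sa in sents_a:
        best = None  # (distance, index into sents_b)
        for k, sb in enumerate(sents_b):
            if k in used:
                continue
            d = word_edit_distance(sa, sb)
            longest = max(len(sa.split()), len(sb.split()), 1)
            if d / longest <= max_ratio and (best is None or d < best[0]):
                best = (d, k)
        if best is not None:
            used.add(best[1])
            pairs.append((sa, sents_b[best[1]], best[0]))
    return pairs


a = ["I went to the store", "it was closed"]
b = ["and it was closed", "uh I went to the store"]
for sa, sb, d in pair_sentences(a, b):
    print(d, "|", sa, "|", sb)
```

With this, an added 'uh' or 'and' costs only one word-level edit, so the sentences still pair up and you can then compare within each pair rather than treating the whole sentence as changed.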