One of the founders of Antithesis gave a talk about this problem last week; diversity in test cases is definitely an issue they're trying to tackle. The example he gave was Spanner tests not filling its cache due to jittering near zero under random inputs. Not doing that appears to be a company goal.
https://github.com/papers-we-love/san-francisco/blob/master/...