yeah I think the system is a combination, although I'm not sure of the intricacies tbh
I've heard Google does something fancier where, after a test fails, they rerun it a bunch of times to check whether it really failed
I think the system at work only runs each test a couple times before giving up and marking it failed
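rough sketch of what I mean by "run it a couple times before giving up", fwiw. this is not our actual system or Google's, and the names (`run_with_retries`, `max_attempts`) are made up for illustration:

```python
from typing import Callable


def run_with_retries(test: Callable[[], None], max_attempts: int = 3) -> bool:
    """Run `test` up to `max_attempts` times.

    Returns True as soon as one attempt passes, and False if every
    attempt raised. Assumption: a test signals failure by raising.
    """
    for _ in range(max_attempts):
        try:
            test()
            return True  # passed on this attempt, stop retrying
        except Exception:
            continue  # failed, try again if attempts remain
    return False  # every attempt failed, so mark the test as failed
```

so a test that only fails some of the time gets through as long as one attempt passes, and one that's actually broken still ends up marked failed after `max_attempts` tries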