If you were hiring a human chauffeur, would you insist they be better than your current driver in all conditions (rain, night-time, snow, off-road)? Or would you more likely ask that they be minimally competent in all likely conditions, and that their driving in aggregate, for your current driving profile, be an improvement?
Refusing to replace a driver until every subcategory of driving is strictly improved seems too stringent.
You still have an out here: you can say you're willing to accept more net risk because of "meta" factors like "driver not meeting minimal competence factors". But in that case it would be more honest to admit outright that this reasoning means knowingly accepting a likely increase in real-world fatalities.
And as an aside, for humans, "minimal competence" testing is reasonable - but at large scale, where the question is a statistical one, such tests are worth less as protection against false claims, since we already have millions of miles of real-world performance data.