I think it is a tricky measurement problem for sure. In this case, it feels like there is an obvious measurement (individual outcomes) that you have to deliberately step around to see the better measurement. That deliberate decision to ignore individual success stories feels very callous.