You can definitely measure it. I've seen many cases where people struggled with a problem for months, with tons of operational incidents, before someone experienced came in and solved it for good in a couple of days.
So even if we believe in the hypothetical general-purpose 10x programmer, a 120x one seems a little implausible, right? I think that person must have been a domain expert, or something was really dysfunctional about the original team.