They said the blinded portion covers only the initial skim of the application, and they get that info after a minute or two. If they blinded the entire pre-interview process (e.g., browsing GitHub profiles, etc. through a Chrome plugin that changes "nitin patel" to "grande puta"), that would be fairly convincing.
Ultimately, though, a quality measurement at the end of the process would be best: measure the outputs rather than the inputs.