That being said, I think it would've made more sense for them to have created some dummy complex project for a class and have say 80% of the class introduce "good code", 10% of the class review all code and 10% of the class introduce these "hypocrite" commits. That way you could do similar research without having to potentially break legit code in use.
I say this since the crux of what they're trying to discover is:
1. In OSS anyone can commit.
2. Though people are incentivized to reject bad code, complexities of modern projects make 100% rejection of bad code unlikely, if not impossible.
3. Malicious actors can take advantage of (1) and (2) to introduce code that does both good and bad things such that an objective of theirs is met (presumably putting in a back-door).