> The bet is that the effect size (if any) will be large enough to be informative despite the noise.
But you have no grounds to ascribe it to the posited difference. Finding no effect might yield more information, but that's hard: given the amount of noise, you're bound to find a great many effects.
> Have you seen this done?
Not in LLMs, but there have been experiments in Second Language Acquisition (L2) studies that regularized languages and had people learn them. What I've seen, though, is inconclusive and sometimes outright contradictory.
I think people have also looked at this via information theory, probably using Markov models.
> Fedorenko's own comparison to "early LLMs" suggests she thinks the analogy has some merit.
I don't think she can seriously entertain that thought. We simply know practically nothing about language processes in the brain. What we know about the hardware is very different from LLMs, early or not.
Just to give an indication of how much we don't know: the Stroop effect (https://en.wikipedia.org/wiki/Stroop_effect) is almost 100 years old. We have no idea what causes it. There's no working model of word recognition. There are only vague suggestions about the origin of the delay. We have no clue how the visual signals for the color and the letters are separated, where they join again, and how that's related to linguistic knowledge. And that's after almost 100 years and a great deal of research. If you go to Google Scholar and type "Stroop task", you'll get 197,000 (!) hits. That's nearly 200k articles etc. resulting in no knowledge whatsoever about a very simple, artificial task.