More fucking morons. The gap with biomedical research isn't in the realm of language models, but in the amount of information that exists in biology that we don't know. I'm not sure what percentage of all the genetic data on Earth we've sequenced, but it's not much, and we still don't quite have a mechanistic understanding of a single cell, much less of a complex multicellular organism with proteins affecting gene expression, cell membrane receptors being reused in 50 different tissue types, molecular secretion and diffusion altering how our minds function, and electrical currents synchronizing brain firing at a distance.
No LLM trained on PubMed will be able to suss this all out - more data is needed.
Even in pure mathematics, where I am currently a grad student and, when needed, a big fan of trying to get LLMs to explain stuff to me at 1 am, they just aren't that good. If it's a popular question that I could have tried on MathOverflow, sure, it's probably just going to get some details weirdly wrong, but for subtle, complex concepts, it's not ushering in some golden age of truth and understanding.
And God help the LLMs that are trying to understand physics after being trained on all the BS on YouTube and the blogs.