But in practice I never saw anyone crack the embedding-generation-and-comparison problems well enough to actually get better results than grep for things like "find similar code and see what it does."
(You also don't need a particularly advanced model to use "grep over a pile of files", but today's models run MUCH faster than GPT-3.5/4 did over the APIs back then, which makes "summarize all five hundred of these matches from those files" much more usable.)
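For what it's worth, the "grep, then summarize" loop is small enough to sketch. This is only a rough illustration and not anyone's actual tooling: `grep_files` and `build_prompt` are made-up names, the `.py` filter and two lines of context are arbitrary, and the model call itself is left out since it depends on whichever API you use.

```python
import re
from pathlib import Path


def grep_files(root, pattern, context=2, suffixes=(".py",)):
    """Return (path, line_number, snippet) for each regex match under root."""
    regex = re.compile(pattern)
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and file types we were not asked to search.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        lines = path.read_text(errors="ignore").splitlines()
        for i, line in enumerate(lines):
            if regex.search(line):
                # Keep a few lines around the match so it is readable alone.
                lo, hi = max(0, i - context), i + context + 1
                hits.append((str(path), i + 1, "\n".join(lines[lo:hi])))
    return hits


def build_prompt(hits, question):
    """Pack the matches into one plain-text prompt (hypothetical format)."""
    parts = [question, ""]
    for path, lineno, snippet in hits:
        parts.append(f"--- {path}:{lineno}")
        parts.append(snippet)
    return "\n".join(parts)
```

You'd call something like `build_prompt(grep_files("src", r"def parse_\w+"), "What does each of these do?")` and hand the resulting string to the model. There's no embedding index to build or keep in sync, which is most of the appeal.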