If that funding were funneled to research groups working on alternative approaches, maybe we'd see the same amount of progress in AI using a different approach.
I agree that Mamba doesn't solve everything and it still needs work. But I disagree with the logic that there isn't an issue of railroading.
What is the best/stable-ish linear alternative to transformers right now? Especially for text generation and summarization.
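For anyone who wants to try one: Mamba checkpoints can be run through the Hugging Face transformers library. A minimal sketch, assuming a recent transformers release with Mamba support (roughly 4.39+) and the `state-spaces/mamba-2.8b-hf` checkpoint; the prompt is just a placeholder, not a benchmark.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint id; swap in whichever linear-time model you want to test.
model_id = "state-spaces/mamba-2.8b-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Placeholder summarization-style prompt.
prompt = "Summarize: Mamba is a selective state-space model that"
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy generation of a short continuation.
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```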
We have domain-specific ways of oversampling and searching, so we much prefer less expensive models; a toy sketch is below.
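A toy sketch of what "oversample then search" can look like: draw several candidates from a cheap model and keep the best one under a domain-specific score. `generate_candidate` and `domain_score` are hypothetical stand-ins for whatever your domain provides, not a real API.

```python
def best_of_n(prompt, generate_candidate, domain_score, n=16):
    # Oversample: n cheap draws instead of one expensive one.
    candidates = [generate_candidate(prompt) for _ in range(n)]
    # Search: rerank with a domain-specific scoring function.
    return max(candidates, key=domain_score)
```

The point is that the reranker does part of the quality work, so the base model can be smaller and cheaper per call.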
Sure, NVIDIA et al. may not want that (although, again, I don't see why: they can't produce chips fast enough as it is, so being able to offer customers usable models now ought to be good for them), but there's so much money out there that does...
Simply put, for the time being huge datasets are going to be needed, and whoever has the bigger (and cleaner?) dataset will have the better-behaved model.