If that funding were funneled to research groups working on alternative approaches, maybe we'd see the same amount of progress in AI using a different approach.
I agree that Mamba doesn't solve everything and it still needs work. But I disagree with the logic that there isn't an issue of railroading.
What is the best/stable-ish linear alternative to transformers right now? Especially for text generation and summarization.
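For anyone who wants to try one: Mamba checkpoints can be run through the Hugging Face transformers library. A minimal sketch, assuming a recent transformers release with Mamba support (roughly 4.39+) and the `state-spaces/mamba-2.8b-hf` checkpoint; the prompt is just a placeholder, not a benchmark.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint id; swap in whichever linear-time model you want to test.
model_id = "state-spaces/mamba-2.8b-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Placeholder summarization-style prompt.
prompt = "Summarize: Mamba is a selective state-space model that"
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy generation of a short continuation.
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```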
We have domain-specific ways of oversampling and searching, so we much prefer less expensive models; a toy sketch is below.
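A toy sketch of what "oversample then search" can look like: draw several candidates from a cheap model and keep the best one under a domain-specific score. `generate_candidate` and `domain_score` are hypothetical stand-ins for whatever your domain provides, not a real API.

```python
def best_of_n(prompt, generate_candidate, domain_score, n=16):
    # Oversample: n cheap draws instead of one expensive one.
    candidates = [generate_candidate(prompt) for _ in range(n)]
    # Search: rerank with a domain-specific scoring function.
    return max(candidates, key=domain_score)
```

The point is that the reranker does part of the quality work, so the base model can be smaller and cheaper per call.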
Sure, NVIDIA et al. may not want that (although, again, I don't see why: they can't produce chips fast enough as it is, so being able to offer customers usable models now ought to be good for them), but there's so much money out there that does...
Simply put, for the time being huge datasets are going to be needed, and whoever has the bigger (and cleaner?) dataset will have the better-behaved model.