0 points · loloquwowndueo · 2y ago · 0 comments

Why?

16 comments · 2 top-level

baobabKoodaa · 2y ago · 14 in thread

With the current generation of text and image generators, 24GB is the sweet spot.

michaelt · 2y ago

My impression is a lot of the open source action is around the just-about-runs-in-12GB region - lots of models coming out with 7B/13B and 4-bit quantisation, a few 70B models (which won't fit in 24GB anyway) and only limited stuff in between.
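
A back-of-the-envelope sketch of why those sizes cluster where they do, counting the weights only. The helper name `weight_memory_gb` is made up for illustration, and context, activations, and framework overhead are ignored, so real usage would be higher:

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    bytes_per_weight = bits_per_weight / 8
    return params_billion * bytes_per_weight

# Sizes mentioned in the thread, at 4-bit quantisation.
for size in (7, 13, 70):
    print(f"{size}B @ 4-bit: {weight_memory_gb(size, 4):.1f} GB")
```

Under these assumptions a 4-bit 13B model needs about 6.5 GB for its weights, which leaves headroom on a 12GB card, while a 4-bit 70B model needs about 35 GB and overshoots 24GB.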

I suppose I could be getting a biased impression though, as of course many more people are in a position to recommend the more accessible models.

What sort of things are you running that take full advantage of that 24GB?

eurekin · 2y ago

As the ancestor commenter mentioned:

> If you’re interested in ML training

Training, at least the kind I tried, has to run in fp16 mode. So a 7B net needs 14 GB for the model weights alone, plus some extra for the context and the stuff I don't really understand (some gradient values; oh, that makes sense now that I've written it).
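
A minimal sketch of that arithmetic, assuming full fine-tuning where every weight gets one gradient stored at the same fp16 precision. The function name is hypothetical, and optimizer state, activations, and context are left out, so the real footprint would be larger:

```python
def training_footprint_gb(params_billion: float, bytes_per_value: int = 2) -> dict:
    """Rough GB needed for weights plus gradients (fp16 means 2 bytes per value)."""
    weights = params_billion * bytes_per_value
    # Assumption: one gradient per weight, kept at the same precision.
    gradients = params_billion * bytes_per_value
    return {"weights": weights, "gradients": gradients, "total": weights + gradients}

print(training_footprint_gb(7))
```

For a 7B net this gives 14 GB of weights and another 14 GB of gradients, which is where the "plus some extra" starts to add up. Methods that only train a small subset of parameters would keep far fewer gradients.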

paulmd · 2y ago

With both you and GP, I would imagine the answer is that people tend to build models for the hardware that is available. If 12GB and 24GB are the hardware thresholds that people have, you'll get "open-source action" in the 12GB and 24GB models, because people want to build things that run on the hardware they own.

(Which is of course how CUDA built its success more generally, vs the "you have to buy the $5k workstation card to get started" strategy from ROCm.)

More generally you'd call this optimization and targeting the hardware that's available. No sense releasing Crysis when everyone is running a Commodore 64, after all.

baobabKoodaa · 2y ago

I actually have a 12GB card, which I purchased specifically for AI (24GB cards are too expensive for me). You're correct that 12GB is also a sweet spot in terms of what you get per dollar spent.

Salgat · 2y ago

I wouldn't be shocked if the 5090 is also the only one with 24GB. Seems like Nvidia is trying their hardest to suppress memory increases.

anonym29 · 2y ago

When you can sell an enterprise-grade card with 40-80GB of VRAM for $50k, selling consumer cards with 24GB for $2k is almost a form of charity, by comparison.

AMD and Intel GPUs do not have the software ecosystem for AI workloads that Nvidia does, though AMD is rapidly improving. Nvidia has had an effective monopoly on the AI hardware space for the last year or so, and continues to have an effective near-monopoly, but that won't last forever as AMD and Intel catch up.

The VRAM is one of the largest differentiators of their cards. Sufficient VRAM allows you to run huge LLMs like 65B in-memory, which is orders of magnitude faster than system RAM + CPU. Smaller amounts of VRAM require swapping between VRAM and system RAM and incur a major performance penalty.

Businesses are fighting to fork over $50k+/card for 40/80GB cards with the same processor as the 24GB consumer cards - it doesn't make economic sense for Nvidia to offer more on the consumer cards, lest they start cannibalizing demand for the enterprise cards.

bbatha · 2y ago

Because the 5090 ostensibly targets gaming, which only needs enough VRAM to display 4K textures on 4K, 5K and ultra-wide monitors. A large portion of the 5090 audience is not doing ML training, and that VRAM would sit idle for a few monitor generations. As a gamer I would be kind of upset if they included that very expensive, unneeded VRAM in their already very expensive cards.

sevagh · 2y ago

I've been hoping for a >24 GB Ada Titan consumer/gamer/retail card for a long time. The 4090 is awesome but I don't want the same VRAM as my 3090.

paulmd · 2y ago

I think it strongly depends on whether 24Gbit GDDR7 is ready to go by that point or not. If so, they'll do 36GB.
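
The 36GB figure falls out of simple chip-count arithmetic. A sketch, assuming a 384-bit memory bus like the 4090's, one 32-bit GDDR chip per channel, and no clamshell (double-sided) layout; the function name is made up and none of this is confirmed for any future card:

```python
def vram_gb(bus_width_bits: int, chip_density_gbit: int, chip_interface_bits: int = 32) -> float:
    """Total VRAM in GB for a bus populated with one memory chip per channel."""
    chips = bus_width_bits // chip_interface_bits  # 384 / 32 = 12 chips
    return chips * chip_density_gbit / 8           # gigabits to gigabytes

print(vram_gb(384, 16))  # 16Gbit chips
print(vram_gb(384, 24))  # 24Gbit chips
```

With 16Gbit chips the same bus gives 24 GB, and swapping in 24Gbit chips gives 36 GB without changing the bus width.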

two_in_one · 2y ago

A 5880 with 48GB has been announced, but I suspect it will not be in the consumer range, more like $3k++.

nightski · 2y ago

Agreed but there is a lot more ML out there than just LLMs. You can't solve everything with prose.

eurekin · 2y ago

The attention mechanism, the core of LLMs, is universal enough to be brought back to standard vision models, which is kind of ironic, since vision models were dominated by convolutions and the transformer is dubbed "convolution for text".

The real reason is that it doesn't deteriorate with input length in the case of text, or with distant neighbourhoods in the case of vision. It's just a new, universal building block that allows shallower neural networks to perform more like their bigger versions.
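
A toy illustration of the distance point, not how any particular model is implemented: a single query scored against keys with plain scaled dot-product attention. It assumes one head, no learned projections, and no positional encoding, and `attention_weights` is a name made up for this sketch:

```python
import math

def attention_weights(query, keys):
    """Softmax of scaled dot products between one query and every key."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

query = [1.0, 0.0]
match = [4.0, 0.0]   # a key that lines up with the query
filler = [0.0, 1.0]  # keys that do not

near = [match] + [filler] * 999  # matching token at the first position
far = [filler] * 999 + [match]   # same token 999 positions further along

w_near = attention_weights(query, near)[0]
w_far = attention_weights(query, far)[-1]
print(w_near, w_far)
```

Under these assumptions the matching key gets the same weight (up to floating-point rounding) whether it sits at the first position or the last, because the score depends only on the content of the query and key. A convolution would need many stacked layers before two positions that far apart could interact at all.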

garyfirestorm · 2y ago

Many text generation models run on my 11G 1080Ti. You can run quantized versions of these models. If you aren't running it quantized, I'd say even 24 gig is not enough.

baobabKoodaa · 2y ago

If you want to get the most bang for your buck, you definitely need to run quantized versions. Yes, there are models that run in 11G, just like there are models that run in 8G, and for any other amount of VRAM - my point is that 24G is the sweet spot.

eurekin · 2y ago

Since n-times a 3090 is still a much better offering
