It will be a niche product with poor sales.
While the 4090 can run models that use less than 24GB of memory at blistering speeds, models are going to continue to scale up, and 24GB is fairly limiting. Because LLM inference can take advantage of splitting layers among multiple GPUs, high-memory GPUs that aren't super expensive are desirable.
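To make that concrete, layer splitting is already a one-liner in common tooling. A minimal sketch with Hugging Face transformers + accelerate; the model id and per-card caps are placeholders, not a recommendation:

```python
# Sketch: shard an LLM's layers across two 24GB cards.
# Requires the accelerate package; the model id is hypothetical.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "some-org/model-too-big-for-one-card"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",                     # let accelerate place layers on visible GPUs
    max_memory={0: "22GiB", 1: "22GiB"},   # leave headroom on each 24GB card
)

inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```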
To share a personal perspective: I have a desktop with a 3090 and an M1 Max Studio with 64GB of memory. I use the M1 for local LLMs because I can give them up to ~57GB of memory, even though the output (in terms of tok/s) is much slower than with models that fit on the 3090.
I would gladly buy a card that ran a touch slower but had massive VRAM, especially if it was affordable, but I guess that puts me in that camp of enthusiasts you mentioned.
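For what it's worth, my workflow is basically llama.cpp with everything offloaded to Metal, where the GPU can address most of the unified memory. A minimal sketch with llama-cpp-python, assuming a quantized GGUF model (the path is a placeholder):

```python
# Sketch: run a local model on Apple Silicon with llama-cpp-python.
# n_gpu_layers=-1 offloads every layer to the Metal backend.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/some-large-model.Q4_K_M.gguf",  # placeholder path
    n_gpu_layers=-1,  # offload all layers to the GPU / unified memory
    n_ctx=4096,
)

out = llm("Q: Why buy a high-VRAM card? A:", max_tokens=64)
print(out["choices"][0]["text"])
```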
>24GB is fairly limiting
Can I take a moment to suggest that maybe we're very spoiled?
24GB of VRAM is more than most people's system RAM, and that is "fairly limiting"?
To think Bill once said 640KB would be enough.
The fact is large language models require a lot of VRAM, and the more interesting ones need more than 24GB to run.
The people who are able to afford systems with more than 24GB of VRAM will go buy hardware that gives them that, and when GPU vendors release products with insufficient VRAM, they limit their market.
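The back-of-envelope math makes the point: weights alone take roughly parameter count times bytes per parameter, before you add KV cache and runtime overhead. A quick sketch (my numbers, nothing vendor-specific):

```python
# Rough weight footprint: params x bits-per-param / 8, in GiB.
# Ignores KV cache and runtime overhead, so real needs are higher.
def weight_gib(params_billion: float, bits_per_param: float) -> float:
    return params_billion * 1e9 * bits_per_param / 8 / 2**30

for params in (7, 13, 34, 70):
    for bits in (16, 8, 4):
        print(f"{params:>3}B @ {bits:>2}-bit: {weight_gib(params, bits):6.1f} GiB")
```

A 13B model at fp16 is already ~24 GiB of weights, which doesn't fit on a 24GB card, and a 70B model needs ~33 GiB even at 4-bit.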
I mean inequality is definitely increasing at a worrying rate these days, but let's keep the discussion on topic...
But selling to machine learning enthusiasts is not a bad place to be. A lot of these enthusiasts are going to go on to work at places that are deploying enterprise AI at scale. Right now, almost all of their experience is with CUDA, and they're likely to recommend hardware they're familiar with. By making consumer Intel GPUs attractive to ML enthusiasts, Intel would make their enterprise GPUs much more interesting to enterprise buyers.
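As a small illustration of that lock-in: device-agnostic PyTorch is possible, but "cuda" is the device everyone types first, and everything else is the fallback. (A sketch; torch.xpu is the Intel GPU backend in recent PyTorch releases.)

```python
# Sketch: the device-selection dance needed to run common ML code
# anywhere other than an NVIDIA card.
import torch

if torch.cuda.is_available():
    device = torch.device("cuda")   # the default almost all code assumes
elif hasattr(torch, "xpu") and torch.xpu.is_available():
    device = torch.device("xpu")    # Intel GPUs
elif torch.backends.mps.is_available():
    device = torch.device("mps")    # Apple Silicon
else:
    device = torch.device("cpu")

x = torch.randn(1024, 1024, device=device)
print((x @ x).sum().item(), "on", device)
```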
It doesn't need to be consumer grade, and it doesn't need to be ultra high-end either.
It needs to be cheap enough for my department to expense it via petty cash.
It doesn't even matter if that's your primary goal or not.
Frustrated AMD customers willing to put their money where their mouth is?
>4090
That's noob hardware. The A6000 is my choice.
Which really only further emphasizes your point.
>CPU based is a waste of everyone's time/effort
>GPU based is 100% limited by VRAM, and is what you are realistically going to use.
It's not like they don't have a monopoly on pre-installed OSes.
If Intel sells a stackable kit with a lot of RAM and a reasonable interconnect, a lot of corporate customers will buy it. It doesn't even have to be that good, just halfway between PCIe 5.0 and NVLink.
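For scale, PCIe 5.0 x16 is ~64 GB/s per direction while NVLink on Hopper-class parts is quoted at ~900 GB/s per GPU; "halfway" below is just my own geometric-mean reading of that gap:

```python
# Rough, publicly quoted per-GPU bandwidth figures (GB/s).
pcie5_x16 = 64   # PCIe 5.0 x16, one direction
nvlink4 = 900    # NVLink on Hopper-class parts, aggregate

halfway = (pcie5_x16 * nvlink4) ** 0.5  # geometric mean
print(f"~{halfway:.0f} GB/s would sit between the two.")
```

A couple hundred GB/s per device at a sane price would already be genuinely useful for multi-card inference.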
But it seems they are still too stuck in their old ways. I wouldn't count on them waking up. Nor AMD. It's sad.