Intel definitely seems to be doing all the right things on software support.
This is a huge problem because in theory the Arc A770 is faster! Its theoretical performance (TFLOPS) is more than twice that of an Nvidia 4060 (see: https://cdn.mos.cms.futurecdn.net/Q7WgNxqfgyjCJ5kk8apUQE-120... ). So why does it perform so poorly? Because everything AI-related has been developed and optimized to run on Nvidia's CUDA.
Mostly, this is a mindshare issue. If Intel offered a workstation GPU (i.e. not a ridiculously expensive "enterprise" monster) that developers could use that had something like 32GB or 64GB of VRAM it would sell! They'd sell zillions of them! In fact, I'd wager that they'd be so popular it'd be hard for consumers to even get their hands on one because it would sell out everywhere.
It doesn't even need to be the fastest card. It just needs to offer more VRAM than the competition. Right now, if you want to do things like training or video generation the lack of VRAM is a bigger bottleneck than the speed of the GPU. How does Intel not see this‽ They have the power to step up and take over a huge section of the market but instead they're just copying (poorly) what everyone else is doing.
Intel, screw everything else, just pack as much VRAM in those as you can. Build it and they will come.
It will be a niche product with poor sales.
While the 4090 can run models that use less than 24GB of memory at blistering speeds, models are going to continue to scale up and 24GB is fairly limiting. Because LLM inference can take advantage of splitting the layers among multiple GPUs, high memory GPUs that aren't super expensive are desirable.
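A minimal sketch of what that layer splitting looks like in practice, assuming Hugging Face transformers with accelerate installed (the model name is just an example, swap in whatever you run):

    # Let accelerate shard the model's layers across whatever GPUs are visible.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "meta-llama/Llama-2-13b-hf"  # example model, not a recommendation
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # halves the weight footprint vs fp32
        device_map="auto",          # splits layers across all available GPUs
    )

    inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
    print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))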
To share a personal perspective, I have a desktop with a 3090 and an M1 Max Studio with 64GB of memory. I use the M1 for local LLMs because I can give them up to ~57GB of memory, even though the output (in terms of tok/s) is much slower than models I can fit on the 3090.
But selling to machine learning enthusiasts is not a bad place to be. A lot of these enthusiasts are going to go on to work at places that are deploying enterprise AI at scale. Right now, almost all of their experience is CUDA and they're likely to recommend hardware they're familiar with. By making consumer Intel GPUs attractive to ML enthusiasts, Intel would make their enterprise GPUs much more interesting for enterprise.
It doesn't even matter if that's your primary goal or not.
Frustrated AMD customers willing to put their money where their mouth is?
>4090
That's noob hardware. The A6000 is my choice.
Which really only further emphasizes your point.
>CPU based is a waste of everyone's time/effort
>GPU based is 100% limited by VRAM, and is what you are realistically going to use.
If Intel sells a stackable kit with a lot of RAM and a reasonable interconnect, a lot of corporate customers will buy it. It doesn't even have to be that good, just halfway between PCIe 5.0 and NVLink.
But it seems they are still too stuck in their old ways. I wouldn't count on them waking up. Nor AMD. It's sad.
Which I don't care too much about.
However, even 16->24GB is a big step, since a lot of the models are developed for 3090/4090-class hardware. 36GB would place it close to the class of the fancy 40GB data center cards.
If Intel decided to push VRAM, it would definitely have a market. Critically, a lot of folks would also be incentivized to make their software compatible, since it would be the cheapest way to run models.
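Napkin math on why those VRAM tiers matter, counting weights only (the KV cache and activations add more on top):

    # Back-of-the-envelope VRAM for just the weights: params * bytes_per_param
    def weights_gb(params_billion, bytes_per_param):
        return params_billion * 1e9 * bytes_per_param / 2**30

    for params in (7, 13, 34, 70):
        print(f"{params}B: fp16={weights_gb(params, 2):.0f}GB, "
              f"int4={weights_gb(params, 0.5):.0f}GB")

A 13B model at fp16 is already ~24GB before the KV cache, which is exactly why 24GB cards feel tight.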
I want a consumer card that can do some number of tokens per second. I do not need a monster that can serve as the basis for a startup.
I heard some ASRock motherboard BIOSes can set the VRAM up to 64GB on Ryzen 5.
Doing some investigations with different AMD hardware atm.
Ryzen 5 has both the CPU and GPU on one chip, and the BIOS lets you set the amount of VRAM. They share the same RAM bank: with a 32GB bank you can set 16GB as VRAM and leave 16GB for the OS.
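If you want to confirm what the BIOS carve-out actually exposes, something like this should work, assuming a ROCm build of PyTorch that supports your APU (which is its own adventure):

    # Quick check of how much VRAM the framework can actually see.
    import torch

    if torch.cuda.is_available():  # ROCm devices show up through the cuda API
        props = torch.cuda.get_device_properties(0)
        print(f"{props.name}: {props.total_memory / 2**30:.1f} GB VRAM")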
The serious crypto and AI nuts are all using custom hardware. Crypto moved onto ASICs for anything power-efficient, and Nvidia's DGX systems aren't being cannibalized from the gaming market.
Seems like we just need consumer matrix math cards with literally no video out, and then a different set of requirements for those with a video out.
But then those pesky researchers and hackers figured out how to use the matmul hardware for non-gaming.
Right now, the best discriminator they have is that PC users are willing to put up with much smaller amounts of VRAM.
Can you elaborate on this? Intel's reputation for software support hasn't been stellar, what's changed?
The same thing would make a lot of sense here. Super-fast memory close to the GPU, with overflow into classic DDR slots.
As a footnote, going parallel also helps: eight slow sticks of RAM in parallel give the same aggregate bandwidth as one stick that's eight times as fast, as long as you don't multiplex them onto the same traces.
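The arithmetic, using DDR5-4800 as the per-channel baseline:

    # One DDR5-4800 channel: 4800 MT/s * 8 bytes per transfer = 38.4 GB/s
    per_channel = 38.4
    print(8 * per_channel)  # eight independent channels -> 307.2 GB/s aggregate,
                            # same as one hypothetical channel running 8x as fast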
You can pick them up in prebuilds from Dell and Supermicro: https://www.supermicro.com/en/accelerators/intel
Read more about them here: https://www.servethehome.com/intel-shows-gpu-max-1550-perfor...
- SYCL [1]
- Vulkan
- OpenCL
I don't own the hardware, but I imagine SYCL is the most performant option for Arc, because it's the one Intel is pushing for their datacenter stuff (rough sketch below).
[1]: https://www.intel.com/content/www/us/en/developer/articles/t...
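Untested sketch, since I don't own the hardware either, but SYCL is what backs Intel's PyTorch extension, which exposes Arc GPUs as an "xpu" device:

    # Assumes an XPU build of intel_extension_for_pytorch is installed.
    import torch
    import intel_extension_for_pytorch as ipex  # registers the xpu backend

    if torch.xpu.is_available():
        a = torch.randn(4096, 4096, device="xpu", dtype=torch.float16)
        b = torch.randn(4096, 4096, device="xpu", dtype=torch.float16)
        print((a @ b).float().abs().mean().item())  # matmul runs on the Arc GPU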
16GB of VRAM and performance around a 4060 Ti or so, but for 65% of the price
For all their hardware research hiccups in the last 10 years, they've been delivering on open source machine learning libraries.
Apparently the same goes for driver improvements and gaming GPU features over the last year.