Better HN

0 points · cogman10 · 1y ago · 0 comments

Yeah, what both companies would need to be competitive in the GPU sector is a CUDA killer. That's perhaps the one benefit of merging: Antel can more easily standardize something.

23 comments · 6 top-level

quotemstr · 1y ago · 8 in thread

There are already packages that let people run CUDA programs unmodified on other GPUs: see https://news.ycombinator.com/item?id=40970560

For whatever reason, people just delete these tools from their minds, then claim Nvidia still has a monopoly on CUDA.

dzdt · 1y ago

And which of these has the level of support that would let a company build a multi-million-dollar project on top of it?

equestria · 1y ago

We have trillions of dollars riding on one-person open-source projects. This is not the barrier for "serious businesses" that it used to be.

stonogo · 1y ago

Those packages only really perform with low-precision work. For scientific computing, using anything but CUDA is a painful workflow. DOE has been deploying AMD and Intel alternatives in their leadership class machines and it's been a pretty bad speedbump.

smcin · 1y ago

('DOE' = US Department of Energy)

jcranmer · 1y ago

There's already a panoply of CUDA alternatives, and even several CUDA-to-non-Nvidia-GPU alternatives (which aren't supported by the hardware vendors and are in some sense riskier). To my knowledge (this isn't really my space), many of the higher-level frameworks already support these CUDA alternatives.

And yet still the popcorn gallery says "there's no [realistic] alternative to CUDA." Methinks the real issue is that CUDA is the best software solution for Nvidia GPUs, and the alternative hardware vendors aren't seen as viable competitors for hardware reasons, and people attribute the failure to software failures.

jdewerd · 1y ago

> There's already a panoply of CUDA alternatives

Is there?

10 years ago, I burned about 6 months of project time slogging through AMD / OpenCL bugs before realizing that I was being an absolute idiot and that the green tax was far cheaper than the time I was wasting. If you asked AMD, they would tell you that OpenCL was ready for new applications and support was right around the corner for old applications. This was incorrect on both counts. Disastrously so, if you trusted them. I learned not to trust them. Over the years, they kept making the same false promises and failing to deliver, year after year, generation after generation of grad students and HPC experts, filling the industry with once-burned-twice-shy received wisdom.

When NVDA pumped and AMD didn't, presumably AMD could no longer deny the inadequacy of their offerings and launched an effort to fix their shit. Eventually I am sure it will bear fruit. But is their shit actually fixed? Keeping in mind that they have proven time and time and time and time again that they cannot be trusted to answer this question themselves?

80% margins won't last forever, but the trust deficit that needs to be crossed first shouldn't be understated.

quotemstr · 1y ago

> alternative hardware vendors aren't seen as viable competitor for hardware reasons, and people attribute the failure to software failures.

It certainly seems like there's a "nobody ever got fired for buying nvidia" dynamic going on. We've seen this mentality repeatedly in other areas of the industry: that's why the phrase is a snowclone.

Eventually, someone is going to use non-nvidia GPU accelerators and get a big enough cost or performance win that industry attitudes will change.

lmm · 1y ago

> There's already a panoply of CUDA alternatives, and even several CUDA-to-non-Nvidia-GPU alternatives (which aren't supported by the hardware vendors and are in some sense riskier). To my knowledge (this isn't really my space), many of the higher-level frameworks already support these CUDA alternatives.

On paper, yes. But how many of them actually work? Every couple of years AMD puts out a press release saying they're getting serious this time and will fully support their thing, and then a couple of people try it and it doesn't work (or maybe the basic hello world test works, but anything else is too buggy), and they give up.

physicsguy · 1y ago · 4 in thread

You don't get a CUDA killer without the software infrastructure.

Intel finally seem to have got their act together a bit with oneAPI, but they've languished for years in this area.

gpapilion · 1y ago

They weren’t interested in creating an open solution. Both Intel and AMD have been somewhat short-sighted and looked to recreate their own CUDA, and their mistrust of each other has prevented them from reaching a solution that works for both of them.

pbalcer · 1y ago

Disclaimer: I work on this stuff for Intel

At least for Intel, that is just not true. Intel's DPC++ is as open as it gets. It implements a Khronos standard (SYCL), most of the development is happening in public on GitHub, it's permissively licensed, it has a viable backend infrastructure (with implementations for both CUDA and HIP). There's also now a UXL foundation with the goal of creating an "open standard accelerator software ecosystem".

arresin · 1y ago

What’s happening with Intel wino? That seemed like their CUDA-ish effort.

wyldfire · 1y ago

OpenCL was born as a CUDA-alike that could be applied to GPUs from AMD and NVIDIA, and to general-purpose CPUs. NVIDIA briefly embraced it (in order to woo Apple?) and then just about abandoned it to focus more on CUDA. NVIDIA abandoning OpenCL meant that it just didn't thrive. Intel and AMD both embraced OpenCL. Though admittedly I don't know the more recent history of OpenCL.

roenxi · 1y ago · 2 in thread

This meme comes up from time to time but I'm not sure what the real evidence for it is or whether the people repeating it have that much experience actually trying to make compute work on AMD cards. Every time I've seen anyone try the problem isn't that the card lacks a library, but rather that calling the function that does what is needed causes a kernel panic. Very different issues - if CUDA allegedly "ran" on AMD cards that still wouldn't save them because the bugs would be too problematic.

cogman10 (OP) · 1y ago

> Every time I've seen anyone try the problem isn't that the card lacks a library, but rather that calling the function that does what is needed causes a kernel panic.

Do you have experience with SYCL? My experience with OpenCL was that it's really a PITA to work with. The thing that CUDA makes nice is the direct and minimal exercise to start running GPGPU kernels. Write the code, compile with nvcc, cudaed.
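For anyone who hasn't seen it, that whole workflow fits in one file. This is only a minimal sketch, not a benchmark or a recommended structure: the file name `add.cu`, the kernel name `add`, and the problem size are made up for illustration, and it assumes an Nvidia GPU with the CUDA toolkit installed. It uses unified memory (`cudaMallocManaged`) purely to keep the host side short.

```cuda
// add.cu (hypothetical file name): element-wise vector add.
// Build and run, assuming the CUDA toolkit is installed:
//   nvcc add.cu -o add && ./add
#include <cstdio>
#include <cuda_runtime.h>

// Each thread adds one pair of elements; the guard handles the tail
// when n is not a multiple of the block size.
__global__ void add(const float* a, const float* b, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = a[i] + b[i];
    }
}

int main() {
    const int n = 1 << 20;  // illustrative size only
    const size_t bytes = n * sizeof(float);

    // Unified memory is accessible from both host and device, so the
    // sketch needs no explicit cudaMemcpy calls.
    float *a = nullptr, *b = nullptr, *out = nullptr;
    if (cudaMallocManaged(&a, bytes) != cudaSuccess ||
        cudaMallocManaged(&b, bytes) != cudaSuccess ||
        cudaMallocManaged(&out, bytes) != cudaSuccess) {
        fprintf(stderr, "allocation failed\n");
        return 1;
    }

    for (int i = 0; i < n; ++i) {
        a[i] = 1.0f;
        b[i] = 2.0f;
    }

    // Launch enough 256-thread blocks to cover all n elements.
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    add<<<blocks, threads>>>(a, b, out, n);

    // Kernel launches are asynchronous; synchronize before reading
    // results on the host, and surface any launch or runtime error.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("out[0] = %f\n", out[0]);

    cudaFree(a);
    cudaFree(b);
    cudaFree(out);
    return 0;
}
```

The kernel and the host code live in the same source file and go through one compiler invocation, which is the "minimal exercise" part.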

OpenCL had just a weird dance to perform to get a kernel running. Find the OpenCL device using a magic filesystem token. Ask the device politely if it wants to OpenCL. Send over the kernel string blob to compile. Run the kernel. A ton of ceremony and then you couldn't be guaranteed it'd work because the likes of AMD, Intel, or nVidia were all spotty on how well they'd support it.

SYCL seems promising but the ecosystem is a little intimidating. It does not seem (and I could be wrong here) that there is a de facto SYCL compiler. The goals of SYCL compilers are also fairly diverse.

roenxi · 1y ago

> Do you have experience with SYCL?

No, I bought an Nvidia card and just use CUDA.

> OpenCL had just a weird dance to perform to get a kernel running...

Yeah but that entire list, if you step back and think big picture, probably isn't the problem. Programmers have a predictable response to that sort of silliness. Build a library over it & abstract it away. The sheer number of frameworks out there is awe-inspiring.

I gave up on OpenCL on AMD cards. It wasn't the long complex process that got me, it was the unavoidable crashes along the way. I suspect that is a more significant issue than I realised at the time (when I assumed it was just me) because it goes a long way to explain AMD's pariah-like status in the machine learning world. The situation is more one-sided than can be explained by just a well-optimised library. I've personally seen more success implementing machine learning frameworks on AMD CPUs than on AMD's GPUs, and that is a remarkable thing. Although I assume in 2024 the state of the game has changed a lot from when I was investigating the situation actively.

I don't think CUDA is the problem here; math libraries are commodity software that give a relatively marginal edge. The lack of CUDA is probably a symptom of deeper hardware problems once people stray off an explicitly graphical workflow. If the hardware worked to spec I expect someone would just build a non-optimised CUDA clone and we'd all move on. But AMD did build a CUDA clone and it didn't work for me at least - and the buzz suggests something is still going wrong for AMD's GPGPU efforts.

izacus · 1y ago · 2 in thread

Why would you want this kind of increased monopolization? That is, CPU companies also owning the GPU market?

nemomarx · 1y ago

Is it a lot more competitive for Nvidia to just keep winning? I feel like you want two roughly good choices for GPU compute and AMD needs a shot in the arm for that somewhere.

izacus · 1y ago

It is absolutely more competitive when nVidia is a separate company from Intel so they can't pull shit like "our GPUs only work with our CPUs" like Intel is now pulling with their WiFi chips.

seanmcdirmid · 1y ago · 1 in thread

Why doesn’t NVIDIA buy Intel? They have the cash and they have the pairing (M chips being NVIDIA and Intel’s biggest competitors now). It would be an AMD/ATI move, and maybe NVIDIA could do its own M CPU competitor with…whatever Intel can help with.

jimbobbam · 1y ago

They don’t need it; they have Grace.

bionhoward · 1y ago

WGSL seems like a nice standard everyone could get behind
