AMD's Radeon 890M: Strix Point's Bigger iGPU

(chipsandcheese.com)

158 points · luyu_wu · 1y ago · 69 comments

rishav_sharan · 1y ago · 10 in thread

I love how well Intel's Arc iGPU and AMD's Strix Point iGPU are doing. I am planning to get an iGPU laptop with 64 GB of RAM. I plan on running local LLMs and image generators, and hopefully with that much shared RAM that shouldn't be too much of a problem. But I am worried that all LLM tools today are pretty much Nvidia-specific, and that I wouldn't be able to get my local setup going.

replete · 1y ago

I've noticed that some BIOSes do not allow the full capacity of unified memory to be allocated, so if you do this, check that you can actually allocate 16GB. Some are limited to 2 or 4GB, seemingly unnecessarily.

ComputerGuru · 1y ago

Apparently this is a legacy holdover and you should choose the smallest size in the BIOS. Fully unified memory is the norm; you don’t need to do the memory splitting that way.

aappleby · 1y ago

You'll be limited by memory bandwidth more than compute.

imtringued · 1y ago

Anyone who uses a CPU for inference is severely compute constrained. Nobody cares about tokens per second the moment inference is faster than you can read, but staring down a blank screen for 5 minutes? Yikes.

dagmx · 1y ago

Can they access the full RAM? Afaik they get capped to a portion of total available RAM.

But to your other point, very little of the current popular ML stack does more than CUDA and MPS. Some will do rocm but I don’t know if the AMD iGPUs are guaranteed to support it? There’s not much for Intel GPUs.

hedgehog · 1y ago

It depends on the API used, and on whether the data is in the region considered "GPU memory" or is shared with the compute API from the app's memory space. Support is somewhat in flux and I haven't been following closely, but if you're curious, this is my bookmarked jumping-off point (a PyTorch ticket about this):

https://github.com/pytorch/pytorch/issues/107605

sillystuff · 1y ago

> Some will do rocm but I don’t know if the AMD iGPUs are guaranteed to support it?

If you only care about inference, llama.cpp supports Vulkan on any iGPU with Vulkan drivers. On my laptop, whose crap BIOS does not allow changing any video RAM settings, reserved "vram" is 2GB, but llama.cpp-vulkan can access 16GB of "vram" (half of physical RAM). 16GB of vram is sufficient to run any model that has even remotely practical execution speed on my bottom-of-the-line Ryzen 3 3250U (Picasso/Raven 2); you can always offload some layers to the CPU to run even larger models.

(on Debian stable) Vulkan support:

  apt install libvulkan1 mesa-vulkan-drivers vulkan-tools

Build deps for llama.cpp:

  apt install libshaderc-dev glslang-dev libvulkan-dev

Build llama.cpp with vulkan back-end:

  make clean  # I added this, in case you previously built with a different back-end

  make LLAMA_VULKAN=1

If more than one GPU: When running, you have to set GGML_VK_VISIBLE_DEVICES to the indices of the devices you want e.g.,

  export GGML_VK_VISIBLE_DEVICES=0,1,2

The indices correspond to the device order in

  vulkaninfo --summary

By default llama.cpp will only use the first device it finds.

llama.cpp-vulkan has worked really well for me. But, per benchmarks from back when Vulkan support was first released, the CUDA back-end was faster than the Vulkan back-end on NVIDIA GPUs. It is probably the same for ROCm vs Vulkan on AMD. But Vulkan requires zero non-free binary blobs and supports more devices (e.g., my iGPU is not supported by ROCm). I haven't tried it, but you can probably mix GPUs from different manufacturers using Vulkan.
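For scripted runs, the device selection described above could be sketched in Python. This is only an illustration and not part of llama.cpp: the helper name `vulkan_env` is made up here, and the only things taken from the comment are the `GGML_VK_VISIBLE_DEVICES` variable and its comma-separated list of indices.

```python
import os


def vulkan_env(device_indices, base_env=None):
    """Return a copy of the environment with GGML_VK_VISIBLE_DEVICES set.

    device_indices: Vulkan device indices, in the order reported by
    `vulkaninfo --summary`. Hypothetical helper, not a llama.cpp API.
    """
    # Copy so the caller's (or the process's) environment is not mutated.
    env = dict(os.environ if base_env is None else base_env)
    env["GGML_VK_VISIBLE_DEVICES"] = ",".join(str(i) for i in device_indices)
    return env


if __name__ == "__main__":
    # Select devices 0, 1 and 2, matching the export line above.
    print(vulkan_env([0, 1, 2], base_env={})["GGML_VK_VISIBLE_DEVICES"])
```

The returned mapping can be passed as `env=` to `subprocess.run` when launching whichever llama.cpp binary you built.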

guilamu · 1y ago

Be careful: most BIOSes will let you use only 1/4 of the total RAM for the integrated GPU. Some really bad BIOSes even limit it to 2GB, totally ignoring how much RAM is available.

allen_fisher · 1y ago

I set up both Stable Diffusion and LLMs on my desktop without an Nvidia GPU, and everything works well. Stable Diffusion runs on the ONNX backend on my AMD GPU, and LLMs run in GGUF format through ollama on the CPU, though model scale and speed are limited.

jodleif · 1y ago

The problem here is the slow memory… the iGPU is already limited by slow RAM, and with LLMs memory bandwidth is king.
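A rough way to see why bandwidth dominates: during token generation a dense model has to stream roughly all of its weights from memory for each token, so bandwidth divided by the size of the weights gives a ceiling on tokens per second. The sketch below assumes that simplified model (it ignores the KV cache, on-chip caches, and compute time). The function name and the 8 GB model size are hypothetical; the 120 GB/s figure is the Strix Point estimate quoted elsewhere in this thread.

```python
def max_tokens_per_second(bandwidth_gb_s, weights_gb):
    """Upper bound on decode speed if every token reads all weights once."""
    if weights_gb <= 0:
        raise ValueError("weights_gb must be positive")
    return bandwidth_gb_s / weights_gb


if __name__ == "__main__":
    # 120 GB/s: bandwidth figure quoted in this thread for Strix Point.
    # 8 GB: hypothetical size of the quantized model weights.
    print(f"{max_tokens_per_second(120.0, 8.0):.1f} tokens/s at best")
```

Real throughput will be lower than this bound, but it shows why doubling memory bandwidth matters more than doubling shader count for this workload.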

Luker88 · 1y ago · 7 in thread

I basically only buy AMD, but I want to point out that ROCm still doesn't fully support the 780M.

I have a laptop with a 680M and a mini PC with a 780M, both beefy enough to play around with small LLMs. You basically have to force the GPU detection to an older version, and I get tons of GPU resets on both.

AMD, your hardware is good; please give the software more love.

dannyw · 1y ago

AMD doesn't realise the wide penetration and availability of CUDA is what makes the ecosystem so strong. Developers can develop and test on their personal devices which are prevalent, and that's what creates such a big software ecosystem for the expensive chips.

When I raised this feedback with our AMD Rep, they said it was intentional and that consumer GPUs are primarily meant for gaming. Absolutely shortsighted.

brookst · 1y ago

I can forgive AMD for not seeing how important CUDA was ten years ago. Nvidia was both smart and lucky.

But failing to see it five years ago is inexcusable. Missing it two years ago is insane. And still failing to treat ML as an existential threat is, IDK, I’ve got no words.

carlmr · 1y ago

It's either strategic incompetence, technical incompetence, or both at this point.

dboreham · 1y ago

Obviously anything that's known on this thread is known to AMD management, or at least their assistants.

hedora · 1y ago

I recently tried to set up Linux on a few machines with Nvidia and AMD GPUs, and, while AMD could improve, they're way ahead of Nvidia on all fronts except machine learning.

Nvidia's drivers are still uniformly garbage (as they have been for the last 20 years) across the board, but they do work sometimes, and I guess they're better for machine learning. I have a pile of "supported" Nvidia cards that can't run most OpenGL / GLX software, even after installing DKMS, recompiling the planet, etc.

Since AMD upstreamed their stuff into the kernel, everything just works out of the box, but you're stuck with ROCm.

So, for all use cases except machine learning, AMD's software blows Nvidia's out of the water for me. This includes running Windows games, which works better under Linux than Windows (the last time I checked), thanks to Steam.

On my 780M, I installed current Devuan (~= Debian) stable, and had a few xscreensaver crashes and reboots. I checked dmesg, and it had clear errors about IRQ state machines being wrong for some of the radeon stuff. So, even when running future hardware, their error logs are great.

After enabling backports and upgrading the kernel, the dmesg errors went away, and it's a 100% uptime machine.

The remaining hardware problem is that PulseAudio is still terrible after all these years, so I have to repeatedly switch audio out to HDMI.

colordrops · 1y ago

Use PipeWire instead of PulseAudio. Much better.

stuaxo · 1y ago

Having had AMD Ryzen laptops for the last 6-plus years, so much this.

Right now I'm messing around trying to get PyTorch's Vulkan support to compile, just so I can avoid switching to ROCm.

torrance · 1y ago · 6 in thread

These results are promising and hopefully carry over to the upcoming Strix Halo, which I’m eagerly awaiting. With a rumoured 40 compute cores and performance on par with a low-power (<95W) mobile RTX 4070, it would make an exciting small form factor gaming box.

jauntywundrkind · 1y ago

I've been super excited for Strix Halo, but I'm also nervous. Strix Halo is a multi-chip design, and I'm pretty nervous about whether AMD can pull it off in a mobile form factor, while still being a good mobile chip.

Strix Point can be brought down to 15W and still do awesome. And go up to 55W+ and be fine. Nice idles. But it's monolithic, and I'm not sure if AMD & TSMC are really making that power penalty of multichip go down enough.

luyu_wu (OP) · 1y ago

Very valid concerns! AMD's current die-to-die interconnects have some pretty abysmal energy/bit. Really hope they can pull off something similar to Intel's EMIB maybe?

_cenw · 1y ago

The 7945HX3D needs 55W minimum, if that's any indicator.

KingOfCoders · 1y ago

I hope Strix Halo gets a desktop motherboard (no socket :-( ), for the memory bandwidth, for faster compiles (Go). That or a 9950X3D (like the 7950X3D).

naoru · 1y ago

Me too. There's at least one manufacturer that makes a pretty sweet mini-ITX motherboard with the R9 7945HX; I hope they will follow up with Strix Halo once it's released.

layer8 · 1y ago

That kind of performance will still require significant cooling, which if you want it to be quiet is helped by a larger box.

Shorel · 1y ago · 3 in thread

Similar performance to Nvidia 1080 dedicated GPU.

Would I get it? Absolutely yes. A full desktop small form factor is a very convenient, nice thing.

dagmx · 1y ago

Where do you see a performance comparison for the 1080?

The only mention of NVIDIA in the post is of the 1050 which is a considerable step away from a 1080.

> It also moves ahead of Nvidia’s Pascal based GTX 1050 3 GB

lhl · 1y ago

From Notebookcheck's benchmarks it looks like the Radeon 890M is punching at about a GeForce 1650 Mobile's performance: https://www.notebookcheck.net/Computer-Games-on-Laptop-Graph...

Based on https://www.techpowerup.com/gpu-specs/geforce-gtx-1650-mobil... this is about 40% faster than a GTX 1050, but also almost half the speed of a GTX 1080.

Shorel · 1y ago

Sorry, I forgot to add it was the mobile version of the 1080, which is indeed slower than full form factor.

bornfreddy · 1y ago · 2 in thread

Interesting:

> With Strix Point, AMD’s mobile iGPU has a newer graphics architecture than its desktop counterparts. It’s an unprecedented situation, but not a surprising one. Since the DX11 era, AMD has never been able to take and hold the top spot in the discrete GPU market. Nvidia has been building giant chips where cost is no object for a long time, and they’re good at it. Perhaps AMD sees lower power gaming as a market segment where they can really excel. Strix Point seems to be a reflection of that.

Did AMD figure out that this market segment is underserved by NVidia? If so, good for them, laptops could use better GPUs.

dagmx · 1y ago

I doubt Strix Point is gunning for NVIDIA.

It’s more than likely this is just a stronger play to get ahead of Intel in market share.

That’s a much more tangible competitor in that space.

Whether it means more games optimize for AMD as a side effect is tangential at best. Otherwise there’s no real reason to treat this as competing with NVIDIA. It’s an integrated GPU so it’s not moving any extra units.

mmaniac · 1y ago

Nvidia can't really enter this segment unless Windows on ARM takes off, and they don't want to be the one to put the first foot forward.

If Snapdragon X Elite is a success, you can bet Nvidia will be producing laptop SoCs with passable CPUs and great iGPUs.

aurareturn · 1y ago · 2 in thread

Some comparisons:

4k Aztec High GFX

* AMD 890M: 39.1fps

* M3: 51.8fps

3DMark Wild Life Extreme

* AMD 890M: 7623

* M3: 8286

Power:

* AMD 890M: 46w

* M3: 17w

M3 about ~253% more efficient.

But of course, if your goal is gaming, AMD's GPU will still be better because of Vulkan, DirectX, and Windows support. In pure architecture, AMD is quite a bit behind Apple.
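The efficiency claim can be checked as performance per watt. A minimal sketch using only the scores and power figures listed above (the function names are made up for illustration, and it assumes the quoted wattages apply to both benchmarks):

```python
def perf_per_watt(score, watts):
    """Benchmark score divided by power draw."""
    return score / watts


def efficiency_advantage_pct(score_a, watts_a, score_b, watts_b):
    """How much more efficient A is than B, as a percentage."""
    ratio = perf_per_watt(score_a, watts_a) / perf_per_watt(score_b, watts_b)
    return (ratio - 1.0) * 100.0


if __name__ == "__main__":
    # (M3 score, 890M score) for each benchmark quoted above.
    benchmarks = {
        "4k Aztec High (fps)": (51.8, 39.1),
        "3DMark Wild Life Extreme": (8286, 7623),
    }
    for name, (m3, amd) in benchmarks.items():
        pct = efficiency_advantage_pct(m3, 17, amd, 46)
        print(f"{name}: M3 is about {pct:.0f}% more efficient")
```

With these inputs the Aztec result comes out near 258% and the Wild Life Extreme result near 194%, so the quoted ~253% is in the range of the Aztec figure. The exact value depends on which benchmark and which power measurement are used.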

adrian_b · 1y ago

The "170%" number is bogus.

Reducing the power of 890M to 17 W, the same as quoted for M3, would reduce the performance much less than the reduction in power consumption, improving the energy efficiency.

For a valid comparison of the energy efficiency, both systems must be configured for the same power consumption.

Moreover, by themselves those performance values do not prove that AMD is behind Apple in GPU architecture.

The better performance of the Apple GPU could be entirely caused by the much higher memory bandwidth and by the better CMOS process used for the Apple GPU.

For any conclusions about architecture, much more detailed tests would be needed, to separate the effects of the other differences that exist between these systems.

aurareturn · 1y ago

>The "170%" number is bogus.

Actually, it's 253%. I made a mistake assuming the 890M was limited to 35w. It was actually 46w as measured by Notebookcheck.[0]

>Reducing the power of 890M to 17 W, the same as quoted for M3, would reduce the performance much less than the reduction in power consumption, improving the energy efficiency.

That depends. Sure, give almost any chip less power and it will be more efficient. I'm not arguing against that.

The problem with reducing power for the 890M is that it's already slower than the M3 by 26% while using 2.7x the power.

If you give the 890M 17w, yes, it will be more efficient than at 46w. It will just be even slower than the M3.

>The better performance of the Apple GPU could be entirely caused by the much higher memory bandwidth

M3's bandwidth is 102.4 GB/s. AMD Strix Point uses LPDDR5X-7500 in dual channel mode so it should be around 120GB/s.

>and by the better CMOS process used for the Apple GPU.

AMD's Strix Point is manufactured on TSMC's N4P. M3 is on N3B, which is roughly 10% more power efficient than N4P. It doesn't explain the huge discrepancy in efficiency.

[0]https://www.notebookcheck.net/AMD-Zen-5-Strix-Point-iGPU-ana...
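The bandwidth figures above follow from transfer rate times bus width. A small sketch of that arithmetic, assuming a 128-bit memory bus on both chips and LPDDR5-6400 on the M3 (neither assumption is stated in the thread, and the function name is made up here):

```python
def peak_bandwidth_gb_s(transfers_mt_s, bus_width_bits):
    """Theoretical peak bandwidth in GB/s for a given memory configuration."""
    bytes_per_transfer = bus_width_bits / 8
    return transfers_mt_s * 1e6 * bytes_per_transfer / 1e9


if __name__ == "__main__":
    # Strix Point: LPDDR5X-7500 (from the comment), 128-bit bus (assumed).
    print(f"Strix Point: {peak_bandwidth_gb_s(7500, 128):.1f} GB/s")
    # M3: LPDDR5-6400 and 128-bit bus (both assumed).
    print(f"M3:          {peak_bandwidth_gb_s(6400, 128):.1f} GB/s")
```

Under those assumptions the results match the 120 GB/s and 102.4 GB/s quoted in the comment.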

setgree · 1y ago · 2 in thread

So what was AMD thinking with its release of the 8700G and 8600G APUs, and is it planning to phase them out?

They come with the 780M and 760M iGPUs, respectively, and both are outperformed by the 890M at a lower power draw [0]. In theory a consumer can't put the laptop part directly in a PC, but there's already a mini-PC with the 890M [1]. The 8700G sometimes shows up in mid-range and high-end gaming PCs with discrete graphics cards [2], which makes so little sense that I wonder if AMD quietly offloaded them in bulk at a steep discount to vendors.

I've commented on this before [3]; can anyone shed light on the situation?

[0] https://www.anandtech.com/show/21485/the-amd-ryzen-ai-hx-370...

[1] https://www.tomshardware.com/desktops/mini-pcs/soyos-upcomin...

[2] https://www.tomshardware.com/desktops/gaming-pcs/hp-omen-35l...

[3] https://news.ycombinator.com/item?id=41140287

luyu_wu (OP) · 1y ago

APUs have their own small niche in the DIY market! From what I understand, some people build their own computer without a dGPU first, then later purchase a GPU and prefer to be able to change the CPU at that point as well? Hopefully this rationale is what you're asking for!

setgree · 1y ago

In theory this makes sense, but reviews all suggested that the price proposition for the APUs didn't really work and now AMD has released a laptop part that fills the same niche. DIY folks won't be able to get their hands on the laptop part right away but presumably soon? It just seems kind of odd to release something that outperforms your own product in the same segment.

mastax · 1y ago · 1 in thread

How fortunate for Intel that as soon as they ruin their CPU naming scheme, AMD follow suit.

mmaniac · 1y ago

AMD's mobile CPUs have had confusing names since 2022 with the 7000 series, and are now completely bonkers. The completely unnecessary insertion of "AI" into every name, the TDP suffix now an optional prefix, and the extremely poorly justified generation counter starting at 300.

Intel on the other hand were fairly sensible. i7 becomes Ultra 7 and the numbering restarts from 100 (Meteor Lake). That's easy to follow.
