AMD Unveils Ryzen 9000 CPUs for Desktop, Zen 5 (opens in new tab)

(anandtech.com)

334 pointsrx_tx1y ago292 comments

292 comments

AVX512 in a single cycle vs 2 cycles is big if the clock speed can be maintained at all near 5GHz. Also doubling of L1 cache bandwidth is interesting! Possibly, needed to actually feed an AVX512 rich instruction stream I guess.

adrian_b1y ago

For most instructions, both Intel and AMD CPUs with AVX-512 support are able to do two 512-bit instructions per clock cycle. There is no difference between Intel and AMD Zen 4 for most 512-bit AVX-512 instructions.

I expect that this will remain true for Zen 5 and the next Intel CPUs.

The only important differences in throughput between Intel and AMD were for the 512-bit load and store instructions from the L1 cache and for the 512-bit fused multiply-add instructions, where Intel had double throughput in its more expensive models of server CPUs.

I interpret AMD's announcement that now Zen 5 has a double transfer throughput between the 512-bit registers and the L1 cache and also a double 512-bit FP multiplier, so now it matches the Intel AVX-512 throughput per clock cycle in all important instructions.

Aardwolf1y ago

> There is no difference between Intel and AMD Zen 4 for most 512-bit AVX-512 instructions.

Except for the fact that Intel hasn't had any AVX-512 for years already in consumer CPUs, so there's nothing to compare against really in this target market

adrian_b1y ago

The comparison done by the AMD announcement and by everyone else compares the Zen 5 cores, which will be used both in their laptop/desktop products and in their Turin server CPUs, with the Intel Emerald Rapids and the future Granite Rapids server CPUs.

As you say, Intel has abandoned the use of the full AVX-512 instruction set in their laptop/desktop products and in some of their server products.

At the end of 2025, Intel is expected to introduce laptop/desktop CPUs that will implement a 256-bit subset of the AVX-512 instruction set.

While that will bring many advantages of AVX-512 that are not related to register and instruction widths, it will lose the simplification of the high-performance programs that is possible in 512-bit AVX-512 due to the equality between register size and cache line size, so the consumer Intel CPUs will remain a worse target for the implementation of high-performance algorithms.

1 more reply

tempnow9871y ago

The difference is intel chips that support AVX-512 run $1,300 - $11,000 with MUCH higher total system costs whereas AMD actually DOES support AVX-512 on all it's chips and you can get AVX-512 for dirt cheap. The whole intel instruction support story feels garbage here. Weren't they the ones to introduce this whole 512 thing in the first place?

shrubble1y ago

My view is that Intel is trying to do market segmentation. Want AVX512? You need to buy the "pro" lineup of CPUs...

4 more replies

DEADMINCE1y ago

> The only important differences in throughput between Intel and AMD

Not exactly related, but AMD also has a much better track record when it comes to speculative execution attacks.

xattt1y ago

I see the discussion of instruction fusion for AVX512 in Intel chips. Can someone explain the clock speed drop?

adrian_b1y ago

A clock speed drop must always occur whenever a CPU does more work per clock cycle, e.g. when more CPU cores are active or when wider SIMD instructions are used.

However the early generations of Intel CPUs that have implemented AVX-512 had bad clock management, which was not agile enough to lower quickly the clock frequency, i.e. the power consumption, when the temperature was too high due to higher power consumption, in order to protect the CPU. Because of that and because there are no instructions that the programmers could use to announce their intentions of using intensively wide SIMD instructions in a sequence of code, the Intel CPUs lowered the clock frequency preemptively and a lot, whenever they feared that the future instruction stream might contain 512-bit instructions that could lead to an overtemperature. The clock frequency was restored only after delays not much lower than a second. When AVX-512 instructions were executed sporadically, that could slow down any application very much.

The AMD CPUs and the newer Intel CPUs have better clock management, which reacts more quickly, so the low clock frequency during AVX-512 instruction execution is no longer a problem. A few AVX-512 instructions will not lower measurably the clock frequency, while the low clock frequency when AVX-512 instructions are executed frequently is compensated by the greater work done per clock cycle.

1 more reply

camel-cdr1y ago

AVX512 was never over 2 cycles. In Zen4 it used the 256 wide execution units of avx2 (except for shuffle), but there are more then one 256-bit wide execution units, so you still got your one cycle throughput.

dzaima1y ago

More importantly for the "2 cycles" question, Zen 4 can get one cycle latency for double-pumped 512-bit ops (for the ops where that's reasonable, i.e. basic integer/bitwise arith).

Having all 512-bit pipes would still be a massive throughput improvement over Zen 4 (as long as pipe count is less than halved), if that is what Zen 5 actually does; things don't stop at 1 op/cycle. Though a rather important question with that would be where that leaves AVX2 code.

camel-cdr1y ago

> Having all 512-bit pipes would still be a massive throughput improvement over Zen 4 (as long as pipe count is less than halved)

What would be different between doubling tbe pipe width vs number of pipes? (excluding inter lane operations that already had their own 512-bit pipe in Zen4)

1 more reply

wren69911y ago

Double pumping loses its one-cycle latency when you forward to an operation that is genuinely 512 bits wide, like a shuffle. At that point you have to wait for both halves to be available before dispatching. That's another advantage to going fully 512-bit wide

SomeoneFromCA1y ago

I wish it had AVX512 Fp16.

api1y ago

At what point do these become competitive with GPUs for AI cost wise if GPUs retain their nutty price premium?

bloaf1y ago

I’ve been running some LLMs on my 5600x and 5700g cpus, and the performance is… ok but not great. Token generation is about “reading out loud” pace for the 7&13 B models. I also encounter occasional system crashes that I haven’t diagnosed yet, possibly due to high RAM utilization, but also possibly just power/thermal management issues.

A 50% speed boost would probably make the CPU option a lot more viable for home chatbot, just due to how easy it is to make a system with 128gb RAM vs 128gb VRAM.

I personally am going to experiment with the 48gb modules in the not too distant future.

phkahler1y ago

You could put an 8700G in the same socket. The CPU isn't much faster but it has the new NPU for AI. I'm thinking about this upgrade to my 2400G but might want to wait for the new socket and DDR5.

1 more reply

cschneid1y ago

I looked at upgrading my existing AMD based system's ram for this purpose, but found out my mobo/cpu only supports 128gb of ram. Lots, but not as much as I had hoped I could shove in there.

hajile1y ago

Does Zen5 do FP math in a single cycle?

dzaima1y ago

Almost certainly Zen 5 won't have single-cycle FP latency (I haven't heard of anything doing such even for scalar at modern clock rates (though maybe that does exist somewhere); AMD, Intel, and Apple all currently have 3- or 4-cycle latency). And Zen 4 already has a throughput of 2 FP ops/cycle for up to 256-bit arguments.

The thing discussed is that Zen 4 does 512-bit SIMD ops via splitting them into two 256-bit ones, whereas Zen 5 supposedly will have hardware doing all 512 bits at a time.

adrian_b1y ago

Even if Lisa Su said this at the Zen 4 launch, it is not likely that 512-bit operations are split into a pair of 256-bit operantions that are executed sequentially in the same 256-bit execution unit.

Both Zen 3 and Zen 4 have four 256-bit execution units.

Two 512-bit instructions can be initiated per clock cycle. It is likely that the four corresponding 256-bit micro-operations are executed simultaneously in all the 4 execution units, because otherwise there would be an increased likelihood that the dispatcher would not be able to find enough micro-operations ready for execution so that no execution unit remains idle, resulting in reduced performance.

The main limitation of the Zen 4 execution units is that only 2 of them include FP multipliers, so the maximum 512-bit throughput is one fused multiply-add plus one FP addition per clock cycle, while the Intel CPUs have an extra 512-bit FMA unit, which stays idle and useless when AVX-512 instructions are not used, but which allows two 512-bit FMA per cycle.

Without also doubling the transfer path between the L1 cache and the registers, a double FMA throughput would not have been beneficial for Zen 4, because many algorithms would have become limited by the memory transfer throughput.

Zen 5 doubles the width of the transfer path to the L1 and L2 cache memories and it presumably now includes FP multipliers in all the 4 execution units, thus matching the performance of Intel for 512-bit FMA operations, while also doubling the throughput of the 256-bit FMA operations, where in Intel CPUs the second FMA unit stays unused, halving the throughput.

No well-designed CPU has a FP addition or multiplication latency of 1. All modern CPUs are designed for the maximum clock frequency which ensures that the latency of operations similar in complexity with 64-bit integer additions between registers is 1. (CPUs with a higher clock frequency than this are called "superpipelined", but they have went out of fashion a few decades ago.)

For such a clock frequency, the latency of floating-point execution units of acceptable complexity is between 3 and 5, while the latency of loads from the L1 cache memory is about the same.

The next class of operations with a longer latency includes division, square root and loads from the L2 cache memory, which usually have latencies between 10 and 20. The longest latencies are for loads from the L3 cache memory or from the main memory.

1 more reply

dhx1y ago

How are the 24x PCIe 5.0 lanes (~90GB/s) of the 9950X allocated?

The article makes it appear as:

* 16x PCIe 5.0 lanes for "graphics use" connected directly to the 9950X (~63GB/s).

* 1x PCIe 5.0 lane for an M.2 port connected directly to the 9950X (~4GB/s). Motherboard manufacturers seemingly could repurpose "graphics use" PCIe 5.0 lanes for additional M.2 ports.

* 7x PCIe 5.0 lanes connected to the X870E chipset (~28GB/s). Used as follows:

  * 4x USB 4.0 ports connected to the X870E chipset (~8GB/s).

  * 4x PCIe 4.0 ports connected to the X870E chipset (~8GB/s).

  * 4x PCIe 3.0 ports connected to the X870E chipset (~4GB/s).

  * 8x SATA 3.0 ports connected to the X870E chipset (some >~2.4GB/s part of ~8GB/s shared with WiFi 7).

  * WiFi 7 connected to the X870E chipset (some >~1GB/s part of ~8GB/s shared with 8x SATA 3.0 ports).

wtallis1y ago

These processors will use the existing AM5 socket, so they fundamentally cannot make major changes to lane counts and allocations, only per-lane speeds. They're also re-using CPU's IO die from last generation and re-using the same chipset silicon, which further constrains them to only minor tweaks.

Typical use cases and motherboards give an x16 slot for graphics, x4 each to at least one or two M.2 slots for SSDs, and x4 to the chipset. Last generation and this generation, AMD's high-end chipset is actually two chipsets daisy-chained, since they're really not much more than PCIe fan-out switches plus USB and SATA HBAs.

Nobody allocates a single PCIe lane to an SSD slot, and the link between the CPU and chipset must have a lane width that is a power of two; a seven-lane link is not possible with standard PCIe.

Also, keep in mind that PCIe is packet-switched, so even though on paper the chipset is over-subscribed with downstream ports that add up to more bandwidth than the uplink to the CPU provides, it won't be a bottleneck unless you have an unusual hardware configuration and workload that actually tries to use too much IO bandwidth with the wrong set of peripherals simultaneously.

dhx1y ago

Thanks for the description. The article was confusing as to whether the CPU's stated 24x PCIe 5.0 lanes included those required for the chipset. Given that the same AM5 socket is used and X870E is similar to the X670E, this appears to not be the case, and instead the 9950X would have 28x PCIe 5.0 lanes, 4 of which are connected to the daisy-chained chipset and 24 then remain available to the motherboard vendor (nominally as 16 for graphics, 8 for NVMe). I also hadn't realised the CPU would offer 4x USB 4.0 ports directly.

Block diagram for AM5 (X670E/X670): https://www.techpowerup.com/review/amd-ryzen-9-7950x/images/...

Block diagram for AM4 (X570): https://www.reddit.com/r/Amd/comments/bus60i/amd_x570_detail...

adrian_b1y ago

Cheap small computers with Intel Alder Lake N CPUs, like Intel N100, allocate frequently a single PCIe lane for each SSD slot.

However you are right that such a choice is very unlikely for computers using AMD CPUs or Intel Core CPUs.

adgjlsfhk11y ago

with pcie 5, 2x is almost certainly enough for all but the absolute fastest drives.

lmz1y ago

It's usually x16/x4/x4 for GPU/M.2/Chipset. You can check the diagrams from the current x670 boards for info.

wtallis1y ago

AMD's previous socket was usually x16/x4/x4 for GPU/M.2/chipset. For AM5, they added another four lanes, so it's usually x16/x4+x4/x4 for GPU/(2x)M.2/chipset, unless the board is doing something odd with providing lanes for Thunderbolt ports or something like that.

lmz1y ago

Oh you're right. I didn't read the diagram correctly and the specs say 28 total lanes, 24 usable - probably because 4 go to the chipset.

paulmd1y ago

Siena would make a very practical HEDT socket - it's basically half of a bergamo, 6ch DDR5/96x pcie 5.0. It's sort of an unfortunate artifact of the way server platforms have gone that HEDT has fizzled out, they're mostly just too big and it isn't that practical to fit into commodity form-factors anymore, etc. a bigass socket sp3 and 8ch was already quite big, now it's 12ch for SP5 and you have a slightly smaller one at SP6. But still, doing 1DPC in a commodity form factor is difficult, you really need an EEB sort of thing for things like GENOAD8X etc let alone 2 dimms per channel etc, which if you do like a 24-stick board and a socket you don't fit much else.

https://www.anandtech.com/show/20057/amd-releases-epyc-8004-...

2011/2011-3/2066 were actually a reasonable size. Like LGA3678 or whatever as a hobbyist thing doesn't seem practical (the W-3175X stuff) and that was also 6ch, and Epyc/TR are pretty big too etc. There used to exist this size-class of socket that really no longer gets used, there aren't tons of commercial 3-4-6 channel products made anymore, and enthusiast form-factors are stuck in 1980 and don't permit the larger sockets to work that well.

The C266 being able to tap off IOs as SAS3/12gbps or pcie 4.0 slimsas is actually brilliant imo, you can run SAS drives in your homelab without a controller card etc. The Asrock Rack ones look sick, EC266D4U2-2L2Q/E810 lets you basically pull all of the chipset IO off as 4x pcie 4.0x4 slimsas if you want. And actually you can technically use MCIO retimers to pull the pcie slots off, they had a weird topology where you got a physical slot off the m.2 lanes, to allow 4x bifurcated pcie 5.0x4 from the cpu. 8x nvme in a consumer board, half in a fast pcie 5.0 tier and half shared off the chipset.

https://www.asrockrack.com/general/productdetail.asp?Model=E...

Wish they'd do something similar with AMD and mcio preferably, like they did with the GENOAD8X. But beyond the adapter "it speaks SAS" part is super useful for homelab stuff imo. AMD also really doesn't make that much use of the chipset, like, where are the x670E boards that use 2 chipsets and just sling it all off as oculink or w/e. Or mining-style board weird shit. Or forced-bifurcation lanes slung off the chipset into a x4x4x4x4 etc.

https://www.asrockrack.com/general/productdetail.asp?Model=G...

All-flash is here, all-nvme is here, you just frustratingly can't address that much of it per system, without stepping up to server class products etc. And that's supposed to be the whole point of the E series chipset, very frustrating. I can't think of many boards that feel like they justify the second chipset, and the ones that "try" feel like they're just there to say they're there. Oh wow you put 14 usb 3.0 10gbps ports on it, ok. How about some thunderbolt instead etc (it's because that's actually expensive). Like tap those ports off in some way that's useful to people in 2024 and not just "16 sata" or "14 usb 3.0" or whatever. M.2 NVMe is "the consumer interface" and it's unfortunately just about the most inconvenient choice for bulk storage etc.

Give me the AMD version of that board where it's just "oops all mcio" with x670e (we don't need usb4 on a server if it drives up cost). Or a miner-style board with infinite x4 slots linked to actual x4s. Or the supercarrier m.2 board with a ton of M.2 sticks standing vertically etc. Nobody does weird shit with what is, on paper, a shit ton of pcie lanes coming off the pair of chipsets. C'mon.

Super glad USB4 is a requirement for X870/X870E, thunderbolt shit is expensive but it'll come down with volume/multisourcing/etc, and it truly is like living in the future. I have done thunderbolt networking and moved data ssd to ssd at 1.5 GB/s. Enclosures are super useful for tinkering too now that bifurcation support on PEG lanes has gotten shitty and gpus keep getting bigger etc. An enclosure is also great for janitoring M.2 cards with a simple $8 adapter off amazon etc (they all work, it's simple physical adapater).

ComputerGuru1y ago

Very well said. It feels like we have all this amazing tech that could open the door to so many creative possibilities but no one is interested in exploring it.

Before, we had so little but it was all available to utilize to the fullest extent. Now we live in a world of excess but it’s almost a walled garden.

vardump1y ago

I think that decision is ultimately made by the mainboard vendor.

irusensei1y ago

I'll probably wait one or two years before getting into anything with DDR5. I've blew some money on an AMD laptop in 2021. At the time it was a monster with decent expansion options: RX 6800m, Ryzen 9 5900HX. I've stuck it with maximum 64GB DDR4 and 2x 4TB psi 3.0 nvme. Runs Linux very well.

But now I'm seeing lots of things I'm locked out. Faster ethernet standards, the fun that brings with tons of GPU memory (no USB4, can't add 10Gbe either), faster and larger memory options, AV1 encoding. It's just sad that I bought a laptop right before those things were released.

Should had go with a proper PC. Not doing this mistake anymore.

isoos1y ago

It sounds like you need a desktop workstation with replaceable extension cards, and not a mostly immutable laptop, which has different strengths.

irusensei1y ago

Agreed but it will need to wait for now.

worthless-trash1y ago

You will find that this is the cost of any laptop, any time you buy it there is always new tech around the corner and there isn't much you can do about it.

simcop23871y ago

(disclosure I own a 13in one)

Yea closest I see to being better about it is Frame.work laptops, and even then it's not as good a story as desktops, just the best story for upgrading a laptop right now. Other than that buying one and making sure you have at least two thunderbolt (or compatible) ports on separate busses is probably the best you can do since that'd mean two 40Gb/s links for expansion even if it's not portable, but would let you get things like 10GbE adapters or fast external storage and such without compromising too much on capability.

zamadatix1y ago

The idea of Framework sounds so good but their actual implementation has really been lacking so far, especially when you consider total price of the laptop + upgrades vs just buying cheaper devices whole but more often. E.g. wired Ethernet is a USB C 2.5 Gbps adapter that sticks out of the chasses because it doesn't fit.

kmfrk1y ago

Waiting for CAMM2 to get wider adoption could be interesting:

https://x.com/msigaming/status/1793628162334621754

Hopefully won't be too long now.

Delmololo1y ago

Either it was a shitty investment from the beginning or you actually use it very regularly and it would be worth it anyway to slowly thinking about something new.

gautamcgoel1y ago

Surprisingly not that much to be excited about IMO. AMD isn't using TSMC's latest node and the CPUs only officially support DDR5 speeds up to 5600MHz (yes, I know that you can use faster RAM). The CPUs are also using the previous-gen graphics architecture, RDNA2.

mrweasel1y ago

> AMD isn't using TSMC's latest node

Staying on an older node might ensure AMD the production capacity they need/want/expect. If they had aimed for the latest 3nm then they'd have get in line behind Apple and Nvidia. That would be my guess, why aim for 3nm, if you can't get fab time and you're still gaining a 15% speed increase.

arvinsim1y ago

TBH, CPUs nowadays are mostly good enough for the consumer, even at mid or low tiers.

It's the GPUs that are just getting increasing inaccessible, price wise.

michaelt1y ago

Yes - with more and more users moving to laptops and wanting a longer battery life, raw peak performance hasn't moved much in a decade.

A decade ago, Steam's hardware survey said 8GB was the most popular amount of RAM [1] and today, the latest $1600 Macbook Pro comes with.... 8GB of RAM.

In some ways that's been a good thing - it used to be that software got more and more featureful/bloated and you needed a new computer every 3-5 years just to keep up.

[1] https://web.archive.org/web/20140228170316/http://store.stea...

vitus1y ago

> raw peak performance hasn't moved much in a decade.

In general, CPU clock speeds stagnated about 20 years ago because we hit a power wall.

In 1985, the state of the art was maybe 15-20MHz; in 1995, that was 300-500MHz; in 2005, we hit about 3GHz and we've made incremental progress from there.

It turns out that you can only switch voltages across transistors so many times a second before you melt down the physical chip; reducing voltage and current helps but at the expense of stability (quantum tunneling is only becoming a more significant source of leakage as we continue shrinking process sizes).

Most of the advancements over the past 20 years have come from pipelining, increased parallelism, and changes further up the memory hierarchy.

> today, the latest $1600 Macbook Pro comes with.... 8GB of RAM.

That's an unfair comparison. Apple has a history of shipping less RAM with its laptops than comparable PC builds (the Air shipped with 2GB in the early 2010s, eventually climbing up to 8GB by the time the M1 launched).

Further, the latest iteration of the Steam hardware survey shows that 80% of its userbase has at least 16GB of RAM, whereas in 2014 8GB was merely the plurality; not even 40% of users had >= 8GB. A closer comparison point would have been the 4GB mark, which 75% of users met or exceeded.

1 more reply

antisthenes1y ago

> it used to be that software got more and more featureful/bloated and you needed a new computer every 3-5 years just to keep up.

I'm sorry, "used to be" ? 90% of the last decade of hardware advancement was eaten up by shoddy bloated software, where we now have UI lag on the order of seconds, 8GB+ of memory used all the time by god knows what and a few browser tabs and 1 core always peaking in util (again, doing god knows what).

1 more reply

CivBase1y ago

To be fair, a decade ago gaming PCs came with 2GB to 4GB of vRAM. Today's gaming PCs come with 12GB to 20GB of vRAM. Most games don't demand a lot of system memory, so it makes sense that PC gamers would invest in other components.

You're also comparing Windows x86 gaming desktops from a decade ago with macOS AppleSilicon base-spec laptops today. Steam's recent hardware survey shows 16GB as the most popular amount of RAM [1]. Not the 5x increase we've seen in vRAM, but still substantial.

[1] https://store.steampowered.com/hwsurvey/Steam-Hardware-Softw...

2 more replies

faeriechangling1y ago

I honestly think that given the demands of 4K video specifically using potentially a few gigs of memory just for decoding, 8gb made a world of sense, but little has come out since that really needs all that much memory for the average person.

When the industry moves to lpddr6/ddr6 I wouldn’t be shocked to see an increase to 6gb per module standard although maybe some binned 4gb modules will still be sold.

MenhirMike1y ago

Given that there are only 2 CUs in the GPU (and fairly low clock speeds), does the architecture matter much? Benchmarks were kinda terrible, and it looks to me that the intent of the built-in GPU is for hardware video encoding or to run it in a home server system, or emergency BIOS and the like. Compared to the desktop CPUs, even the lowest end mobile 8440U has 4 CUs, going up to 12 CUs on the higher end. Or go with Strix Point, which does have an RDNA 3.5 GPU (with 12 or 16 CUs) in it.

I guess you _can_ game on those 2 CU GPUs, but it really doesn't seem to be intended for that.

spixy1y ago

Better efficiency with three external 4K 120Hz monitors?

MenhirMike1y ago

Oh, there's plenty of good uses for it, but I was specifically wondering why the previous poster cared for it being RDNA 2 instead of RDNA 3 since I don't think it makes a difference outside of gaming, and it's a pretty bad gaming GPU because of the low core count.

So I was curious if there was anything else that RDNA 3/3.5 would offer over RDNA 2 in such a low end configuration.

1 more reply

cschneid1y ago

Yeah, I'm glad they started including built in gpu so there's something there, but beyond booting to a desktop I wouldn't use this graphics for anything else. But if you're just running a screen and compiling rust, that's all you need. Or in my case, running a home server / NAS.

philsnow1y ago

When building a non-APU ryzen machine for homelab use, I ended up buying the very cheapest graphics card I could find that was compatible, a "GeForce GT 710" that was not a beefy card when it was released in 2014. It's.. fine. After getting the system working I passed it through to a win10 VM and I can play non-FPS windows-only steam games over RDP.

So yeah next time I build a machine I'll appreciate having this built in.

1 more reply

AlfeG1y ago

> DDR5 speeds up to 56000MHz (yes, I know that you can use faster RAM)

Not sure that I actually CAN. 56 GHz is already a lot.

gautamcgoel1y ago

Fixed, thanks :)

orphea1y ago

5600 are MT/s (megatransfers/second), not MHz ;)

re-thc1y ago

> The CPUs are also using the previous-gen graphics architecture, RDNA2

Faster GPU is reserved for APUs. These graphics are just here for basic support.

diffeomorphism1y ago

Nah, you can get RDNA3.5 if you want to (not sure why you want that in a (home)server though)

https://www.anandtech.com/show/21419/amd-announces-the-ryzen...

Arrath1y ago

Well perhaps I will stop holding out and just get the 7800x3d, if the 9000 generation won't be too terribly groundbreaking.

doikor1y ago

> The CPUs are also using the previous-gen graphics architecture, RDNA2.

The GPU on these parts is there mostly for being able to boot into BIOS or OS for debugging. Basically when things go wrong and you want to debug what is broken (remove GPU from machine and see if things work)

matharmin1y ago

These are decent GPUs for anything other than heavy gaming. I'm driving two 4k screens with it, and even for some light gaming (such as factorio) it's completely fine.

123yawaworht4561y ago

I'm under the impression that Factorio can run on any GPU capable of producing video output at all. years ago when I played it, it ran perfectly fine on whatever iGPU my 4790K had. 60 FPS/UPS with pretty big bases (although iirc I did disable all video effects like smoke to avoid cluttering the screen)

1 more reply

user_78321y ago

I agree, my 780m is quite capable in most games. Depending on the resolution & settings even cyberpunk 2077 is playable at 60fps. MS Flight sim though hits (presumably the memory bandwidth bottleneck) hard.

2 more replies

moooo991y ago

Hard disagree on that one. I am daily driving an RDNA2 graphics unit for 1.5 years now and it’s absolutely sufficient. I mostly do office work and occasionally play Minecraft. It’s absolutely sufficient for that and I don’t see any reason why you‘d want to waste money on a dGPU for that kind of load

nickjj1y ago

Another advantage of having an integrated GPU is you can do a GPU pass-through and let a VM directly and fully use your dedicated GPU.

This could be a thing if you're running native Linux but some games only work on Windows which you run in a VM instead of dual booting.

ffsm81y ago

I have to disagree. They work great for video playback and office work. So media server, and workstations are fine without a dedicated gpu

ubercore1y ago

> The GPU on these parts is there mostly for being able to boot into BIOS or OS for debugging.

That's wildly not true. Transcoding, gaming, multiple displays, etc. They are often used as any other GPU would be used.

TacticalCoder1y ago

> The GPU on these parts is there mostly for being able to boot into BIOS or OS for debugging.

Not at all. I drive a 38" monitor with the iGPU of the 7700X. If you don't game and don't run local AI models it's totally fine.

And... No additional GPU fans.

My 7700X build is so quiet it's nearly silent. I can barely hear it's Noctua NH-12S cooler/fan ramping up when under full load and that's how it should be.

JonChesterfield1y ago

They also mean you can drive monitors using the builtin GPU while using dedicated ones for compute.

btgeekboy1y ago

Yeah - I've been waiting to see what this release would entail as I kinda want to build a SFF PC. But now that I know what's in it, and since they didn't come out with anything really special chipset-wise, I'll probably just see if I can get some current-get stuff at discounted prices during the usual summer sales.

aurareturn1y ago

It's because x86 chips are no longer leading in the client. ARM chips are. Specifically, Apple chips. Though Qualcomm has huge potential leapfrog AMD/Intel chips in a few generations too.

KingOfCoders1y ago

[If you're a laptop user, scroll down the thread for laptop Rust compile times, M3 Pro looks great]

You're misguided.

Apple has excellent Notebook CPUs. Apple has great IPC. But AMD and Intel have easily faster CPUs.

https://opendata.blender.org/benchmarks/query/?compute_type=...

Blender Benchmark

      AMD Ryzen 9 7950X (16 core)         560.8
      Apple M2 Ultra (24 cores)           501.82
      Apple M3 Max (12 cores)             408.27
      Apple M3 Pro                        226.46
      Apple M3                            160.58

It depends on what you're doing.

I'm a software developer using a compiler that 100%s all cores. I like fast multicore.

      Apple Mac Pro, 64gb, M2 Ultra, $7000
      Apple Mac mini, 32gb, M2 Pro, 2TB SSD, $2600

[Edit2] Compare to: 7950x is $500 and a very fast SSD is $400, fast 64gb is $200, very good board is $400 so I get a very fast dev machine for ~$1700 (0,329 p/$ vs. mini 0,077 p/$)

[Edit] Made a c&p mistake, the mini has no ultra.

aurareturn1y ago

That seems wrong.

Though Blender may have an optimization for avx512 but not for SME or Neon.

But the vast majority will use GPUs to do rendering for Blender.

Try SPEC or its close consumer counterpart, Geekbench.

As an anecdote, all my Python and Node.js applications run faster on Apple Silicon than Zen4. Even my multithread Go apps seem to run better on Apple Silicon.

4 more replies

hajile1y ago

There's something wrong with your M3 Max stuff. I believe it comes in 14 and 16-core variants while the M3 Pro comes in 11 and 12-core variants.

In any case, M3 Max uses less than 55w of power in CPU-only workloads while a desktop 7950x peaked out at 332w of power according to Guru3D (without an OC).

The fact that M2 Ultra hits so close while peaking out at only around 100w of CPU power is pretty crazy (M2 Ultra doesn't even hit 300w with all CPU and GPU cores maxed out).

2 more replies

wtallis1y ago

> Blender Benchmark

Maybe use a benchmark that actually makes sense for CPUs, rather than something that's always much faster on a GPU (eg. M3 Pro as any sane user would use it for Blender is 2.7x the performance of a Ryzen 7950X, not 0.4x).

> Apple Mac mini, 32gb, M2 Ultra, 2TB SSD, $2600

Not a real thing. You meant M2 Pro, because the Max and Ultra chips aren't available in the Mac mini.

2 more replies

krasin1y ago

Until ARM has a proper UEFI support, it's not a practical desktop/server with a few notable exceptions (Mac, Raspberry Pi) and only because there's so much support from the respective vendors.

I know that there's some work happening about UEFI+ARM (https://developer.arm.com/Architectures/Unified%20Extensible...), but its support is very rare. The only example I can recall is Ampere Altra: https://www.jeffgeerling.com/blog/2023/ampere-altra-max-wind...

walterbell1y ago

Thanks to ex-Apple Nuvia/Oryon ("Qualcomm Snapdragon X Elite"), Arm laptops will launch in the next few months from Microsoft, Dell, HP, Lenovo, Asus and other OEMs, with UEFI support for Windows and in-progress support for Linux, https://news.ycombinator.com/item?id=40422286

bitwize1y ago

It's like I keep saying: the first Chinese manufacturer to churn out cheap SBCs with ServerReady support will make a killing as a true Pi killer. Anyone? Anyone? Pine64? Pine64?

1 more reply

craftkiller1y ago

ARM needs more than just proper UEFI support: Microsoft needs to lift the secureboot restrictions on ARM.

x86: Microsoft requires that end-users are allowed to disable secure boot and control which keys are used.

arm: Microsoft requires that end-users are not allowed to disable secure boot

This isn't a hardware issue, but simply a policy issue that Microsoft could solve with a stroke of a pen, but since Microsoft is such a behemoth in the laptop space, their policies control the non-apple market.

source: https://mjg59.dreamwidth.org/23817.html

mmaniac1y ago

AMD seem to be playing it safe this with this desktop generation. Same node, similar frequencies, same core counts, same IOD, X3D chips only arriving later... IPC seems like the only noteworthy improvement here. 15% overall is good but nothing earth shattering.

The mobile APUs are way more interesting.

TacticalCoder1y ago

> 15% overall is good but nothing earth shattering

Interestingly though the 9700X seems to be rated at 65W TDP (compared to a 105 TDP for the 7700X). I run my 7700X in "eco mode" where I lowered the TDP to max 95 W (IIRC, maybe it was 85 W: I should check in the BIOS).

So it looks like it's 15% overall more power with less power consumption.

hajile1y ago

AMD used to give decently accurate TDP, but Intel started giving unrealistically optimistic TDP ratings, so AMD joined the game. Their TDP is more of an aspiration than a reality (ironically, Intel seems to have gotten more accurate with their new base and turbo power ratings).

9700x runs 100MHz higher on the same process as the 7700x. If they are actually running at full speed, I don't see how 9700x could possibly be using less power with more transistors at a higher frequency. They could get lower power for the same performance level though if they were being more aggressive about ramping down the frequency (but it's a desktop chip, so why would they?).

ComputerGuru1y ago

I know AMD has joined the “fudging the TDP numbers” game but you should still be able to more or less compare TDP across their own products.

2 more replies

vegabook1y ago

Geekbench Ryzen 9 7950x is 2930 max, so if we're generous and give the 9950x 15% uplift we'll be at 3380, which is still 400 points or so behind apple silicon for a much higher clock speed and a multiplier larger power draw. Also the max memory bandwidth at 70GB/s or so is basically pathetic, trounced by ASi.

mmaniac1y ago

You're comparing apples and oranges. Ryzen has never had the lead with 1T performance, but emphasises nT and core counts instead. Memory bandwidth is largely meaningless for desktop CPUs but matters a lot for a SoC with a big GPU.

Strix Halo appears to be AMD's competitor to Apple SoCs which will feature a much bigger iGP and much greater memory bandwidth. When we hear more about that, comparisons will be apt.

aurareturn1y ago

Ryzen doesn't even lead in MT on laptops.

With M4, they're likely to fall even farther behind. M4 Pro/Max is likely to arrive in Fall. AMD's Strix Point doesn't seem to have a release date.

1 more reply

vegabook1y ago

AMD has been making SoCs forever. Why does it take a kick in the teeth from Apple for both Intel and AMD suddenly to wake up and give us performance SoCs, 5 years later? They've been coasting on the stone-age motherboard+cards arch because essentially Nvidia gave you a big, fat, modern coprocessor in the form of an [g/r]tx card that hid the underlying problems.

They could have done what AAPL did ages ago but they have no ability to innovate properly. They've been leaning on their x86 duopoly and if it's now on its last legs, it's their fault.

2 more replies

KingOfCoders1y ago

(Not a gamer)

For me as a developer Geekbench Clang benchmarks:

    M2 Ultra   233.9 Klines/sec
    7950x      230.3 Klines/sec  
    14900K     215.3 Klines/sec  
    M3 Max     196.5 Klines/sec

mmaniac1y ago

Even including M2 Ultra in that comparison is a bit unfair. That's a HEDT workstation CPU. The competition for this is a Threadripper.

1 more reply

andy_ppp1y ago

M4 Ultra is going to be interesting, it could be approaching 300?

re-thc1y ago

> which is still 400 points or so behind apple silicon for a much higher clock speed and a multiplier larger power draw

Not a fair comparison. If we're on about Geekbench as per the announcement, it's +35%. The 15% is a geomean. It might not be better but definitely not far off Apple.

In a similar manner, except Geekbench the geomean of M3 vs M4 isn't that great either.

fh93021y ago

+35% is for a single cherry-picked Geekbench AES subtest. AMD did not show the overall Geekbench improvement.

1 more reply

ArtTimeInvestor1y ago

Do all CPUs and GPUs these days involve components made by TSMC?

If so, is this unique - that a whole industry of relies on one company?

microtonal1y ago

Arguably that single company is ASML. There are more fabs (e.g. Intel), but AFAIK cutting-edge nodes all use ASML EUV chip fabrication machines?

Cu3PO421y ago

Intel still fabs their own CPUs, their dedicated Xe graphics are made by TSMC, though.

Nvidia 30-series was fabbed by Samsung.

So there is some competition in the high-end space, but not much. All of these companies rely on buying lithography machines from ASML, though.

Wytwwww1y ago

>Intel still fabs their own CPUs

Isn't Lunar Lake made by TSMC? Supposedly they have comparable efficiency to AMD/Apple/Qualcomm at the cost of making their fab business even less profitable

moooo991y ago

AFAIK there are some components that are made by TSMC while some others are still made at Intels own fabs

1 more reply

unwind1y ago

As far as I know, Intel is still very much a fab company.

pjc501y ago

This is probably a lot more common than you might think. How much of an "entire industry", or indeed industry as a whole, relies on Microsoft?

icf801y ago

wait till china invades

brokencode1y ago

Good thing TSMC is getting billions of dollars of government subsidies to build fabs all around the world including in the US and Japan.

preisschild1y ago

As long as Biden wins the election, they won't.

Because the US will defend Taiwan.

BeefWellington1y ago

The real problem is, any disruption of Taiwan and China's ability to produce and sell semiconductors right now is going to have wide ranging impacts on the economy.

If you thought prices were high during pandemic shortages, strap in.

It's probably not a coincidence that as soon as the US starts spending billions to onshore semiconductor production, China begins a fresh round of more concrete saber-rattling. (Yes, there's likely other factors too.)

kfrzcode1y ago

Your crystal ball is intriguing, do you have a whitepaper I can check out to understand the underlying mechanics?

gattr1y ago

I normally rebuild my workstation every ~4 years (recently more for fun than out of actual need for more processing power), might finally do it again (preferably a recent 8-/12-core Ryzen). My most recent major upgrade was in 2017 (Core i5 3570K -> Ryzen 7 1700X), with a minor fix in 2019 (Ryzen 7 2700, since 1700X was suffering from the random-segfaults-during-parallel-builds issue).

Night_Thastus1y ago

Same. I'm on a 10700k and thinking on an upgrade. I'll wait for X3D parts to come out (assuming they're doing that this gen, not sure if we've got confirmation), and compare vs 15th gen Intel once it's out in like September-ish.

tracker11y ago

Might be worth considering a Ryzen 9 5900XT (just launched as well) for a drop in upgrade. Been running a 5950X since close to launch and still pretty happy with it.

mananaysiempre1y ago

Would it really be smart to build an AM4 desktop at this point though?

tracker11y ago

I said as a "drop in upgrade"... Not sure that I'd build AM4 today, unless I had some parts used and cheap/free.

ComputerGuru1y ago

Interesting unveil; it seems everyone was expecting more significant architectural changes though they still managed to net a decent IPC improvement.

In light of the "very good but not incredible" generation-over-generation improvement, I guess we can now play the "can you get more performance for less dollars buying used last-gen HEDT or Epyc hardware or with the newest Zen 5 releases?" game (NB: not "value for your dollar" but "actually better performance").

tripdout1y ago

7900X equivalent 9900X has 120W TDP as opposed to 170W. Is it that much more power efficient?

Voultapher1y ago

That 170W TDP was chasing the benchmark crown. The 7950X lost <5% MT perf when run at 120W.

canucker20161y ago

The all-core max clock speed costs a lot of watts for that last 10% - close to 50% of total CPU consumption in some cases.

That's why undervolting has become a thing to do (unless you're an Intel CPU marketer) - give up a few percent of your all-core max clock rate and cut your wattage used by a lot.

GordonS1y ago

I wonder then if the 9900X could be undervolted without losing more than a few percentage points of speed?

sylware1y ago

Improving avx512 is really good, since this is the data unit sweet spot: a cache line is 512 bits.

But I am more interested in the cleanup of the GPU hardware interface (it should be astonishingly simple to program the GPU with its various ring buffers, like as it is rumored to be the case on nvidia side) AND in the squishing of all hardware shader bugs: look at valve ACO compiler erratas in mesa, AMD hardware shader is a bug minefield. Hopefully, the GFX12 did fix ALL KNOWN SHADER HARDWARE BUGS (sorry, ACO is written with that horrible c++, I dunno what went thru the head of valve and no, rust syntax is a complex as c++, then this is toxic too).

Sparkyte1y ago

Huge improvement over my 5k series processor.

Pet_Ant1y ago

Does anyone have metrics for some sort of benchmark per transistor? I wonder where the sweetspot is between something like these and SERV [1]. Obviously GPU wins on parallel math, but I wonder how much we are chasing in terms diminishing returns.

[1] https://www.youtube.com/watch?v=GSHasXHvZaQ

adriancr1y ago

Finally, was waiting for this for a new build and it seems decent upgrade.

nubinetwork1y ago

I'd love to replace my 2950x, but most desktop motherboards still don't support 128gb of memory.

mkl1y ago

No, a lot support 128GB. Some support 192GB or even 256GB: https://skinflint.co.uk/?cat=mbam5&xf=317_X670E&asuch=&bpmin...

lldb1y ago

You can run 4x32 DDR5 on current gen AMD consumer platforms but don't expect speeds above 4400MHz in quad-channel regardless of what the module is rated for. I'd instead suggest dual channel 48GB DIMMs for 96GB at full speed.

IamFr0ssT1y ago

4 DIMMs does not equal Quad-channel. I see in AMD's presentation Quad-channel is supported by chipsets, but I am not aware of a current AMD consumer/hedt chip with 4 memory channels.

MenhirMike1y ago

Non-Pro Threadripper 7960X/7970X/7980X are Quad Channel. (The Threadripper Pro's are Octa-Channel)

But yeah, Desktop Ryzen's are all Dual Channel.

1 more reply

erinnh1y ago

I just had a look and the comparison website I use says there are 113 motherboards that support 128GB or more memory for AM5. (in my local market. so the US probably has even more)

128GB isn't exactly a lot, so that would surprise me if it wasnt supported.

nubinetwork1y ago

I think for a while there, the only way to be able to use 128gb was to go TR4 or TRX... I kindof stopped looking for a while, but 100+ boards is certainly a nice change.

DEADMINCE1y ago

> I just had a look and the comparison website I use

Which website is that?

erinnh1y ago

geizhals.de

Very good website to compare gadgets/electronics and look for good prices.

Its for the German market though (and some small amount of Austrian shops)

1 more reply

SirGiggles1y ago

I can’t speak for all motherboards but I’m comfortably running 128GiB of DDR4 on a 5950x and x570s platform.

nubinetwork1y ago

Good to know, maybe I should look again... thanks

chainingsolid1y ago

I'm running the same as above. However do check both the memory and motherboard manufacture's tested with lists (QVL qualified vendor list) before purchasing! Each product should have its own compatablity listing.

yourusername1y ago

For AM5 most boards with 4 DIMM slots will support 128 GB, some support 192 GB.

Tepix1y ago

I had 128GB DDR4 3200 running stable on X570 with Ryzen 3000. You should be able to run 192GB these days or even 256GB (but expensive).

JonChesterfield1y ago

Give the server ones a look. 448GB here (one stick out of eight died and I have not been proactive at replacing it). Epyc comes in some SKUs that make a lot of sense for a desktop, or the seriously high spec ones from a previous generation are going cheap.

resource_waste1y ago

Kind of a shame, given how cheap 512gb is.

With LLMs, I feel like the line between consumer and professional is getting blurred.

Not that CPU is really reasonable for LLMs that big...

Tepix1y ago

It's weird that CPUs tend to use more power every two years now instead of less. Looks like this time, the TDP didn't increase. Yay! i guess. Don't most people think their PCs are fast enough already?

It's as if our planet wasn't being destroyed at a frightening speed. We're headed towards a cliff, but instead if braking, we're accelerating.

mmaniac1y ago

The people who think PCs are already fast enough don't buy CPUs every year.

A 7950X in Eco mode is ridiculously capable for the power it pulls but that's less of a selling point.

hajile1y ago

I think there's a market for a 7950E with a specific binning for lower power rather than highest possible clocks.

mmaniac1y ago

Compared to Intel, AMD seems to bin their chips a lot less. Intel have their -T chips which would be nifty if Intel weren't so far behind in terms in terms of efficiency.

adham-omran1y ago

I have a 7950X which I run in Eco mode with air cooling and it's amazingly quiet when not under load and rips when it needs to, I don't see a reason to upgrade any time soon.

dotnet001y ago

The TDP increases slower than the efficiency, so for the same workload, a newer CPU consumes similar or less power. Especially nowadays, most of these chips only get anywhere near the stated TDP when being near fully maxed out.

jiggawatts1y ago

I wonder when the PC world will transition to unified memory and more monolithic architectures as seen in the Apple M-series chips, but also in the NVIDIA GB200 "superchip", the AMD MI300 accelerator, the X-Box and Playstation consoles, and even in some mobile phone system-on-a-chip designs. It feels like the PC is the "last holdout" of discrete upgradeable components with relatively low bandwidth between them.

Sooner or later, AI will need to run on the edge, and that'll require RAM bandwidths measured in multiple terabytes per second, as well as "tensor" compute integrated closely with CPUs.

Sure, a lot of people see LLMs as "useless toys" or "overhyped" now, but people said that about the Internet too. What it took to make everything revolve around the Internet instead of it being just a fad is broadband. When everyone had fast always-on Internet at home and in their mobile devices, then nobody could argue that the Internet wasn't useful. Build it, and the products will come!

If every gaming PC had the same spec as a GB200 or MI300, then games could do real-time voice interaction with "intelligent" NPCs with low latency. You could talk to characters, and they could talk back. Not just talk, but argue, haggle, and debate!

"No, no, no, the dragon is too powerful! ... I don't care if your sword is a unique artefact, your arm is weak!"

I feel like this is the same kind of step-change as floppy drives to hard drives, or dialup or fibre. It'll take time. People will argue that "you don't need it" or "it's for enterprise use, not for consumers", but I have faster Internet going to my apartment than my entire continent had 30 years ago.

pquki41y ago

I doubt that's ever going to happen -- people build their own PCs exactly because they want the opposite of everything in a MacBook: they want modularity, upgradeability and repairability to the point that they are willing to sacrifice power efficiency.

"AI will need to run..."

Let's wait and see what actually happens to AI before being too eager to change the design of computers. I'm also pretty sure there will be a better solution than what you described.

Semaphor1y ago

> people build their own PCs exactly because they want the opposite of everything in a MacBook: they want modularity, upgradeability and repairability to the point that they are willing to sacrifice power efficiency.

While I’m totally one of those people, aren’t we some rather small minority, nowadays? I mean, obviously still big enough for companies to produce parts we want, but I always keep reading how more and more people are using laptops instead of desktops.

dotnet001y ago

These aren't laptop parts though. Maybe they'll do integrated memory for laptops, since many solder that on anyway nowadays.

But, for the desktop parts I don't see that being worthwhile unless it's used as something like with the 'X3D' chips, a large cache layer to the expandable memory.

I feel like this is nicely indicated by how AMD's desktop APUs get a lot less interest, they were fine without it for several generations, and even now it's just an afterthought.

smolder1y ago

I would prefer that we figure out how to make the massive vector blobs called AI work more efficiently, versus throwing tons of hardware at what we can barely understand in perpetuity. It doesn't sit right with me that putting lots of floating point ability in edge devices for running these approximate-at-best models is considered the way forward for computing.

ericd1y ago

Yeah, bitnets already shift those matmuls to integer math.

smolder1y ago

Thanks for turning me on to this subject. It's interesting. I took a look at this paper: https://arxiv.org/abs/2310.11453

1 more reply

aaomidi1y ago

I can’t wait for my games to start hallucinating narration.

fourfour31y ago

Exactly this.

I want plot/character driven games like RPGs to be a curated & carefully designed, plotted, and paced experience.

I want speech & narrative to be scripted! It means someone sat down and thought about the experience it produces for the player. It means a real voice actor has performed it, putting the right emotion and pacing into the line.

I don't want AI generated stilted dialogue, uncanny valley speech, etc.

And I also don't want an extra few hundred watts of power draw on gaming PCs - they're already high and in modern games the CPU is under pretty substantial load, the GPU is maxed out, and the GPU's AI/NPU style cores are being used for things like DLSS too.

Bringing in more compute resource for running speech to text, LLMs, text to speech, etc fast enough to not feel horrible is going to come at substantial power and financial cost.

nottorp1y ago

But... that would be ... "making video games". The large "video game companies" would rather "create content" and use the cheapest possible option.

mavhc1y ago

You want your NPCs to be limited in their conversations, to only follow a set few paths, if you think of a way to play the game that's not been anticipated years ago, you don't want that to be possible

1 more reply

WithinReason1y ago

This but unironically

treprinum1y ago

Isn't CAMM2 providing higher bandwidth? MOBOs with CAMM2 for AMD are already popping up.

cubefox1y ago

Problem with unified memory is that software needs to support it, so there is a chicken-egg situation. Another issue is that unified memory is usually not replaceable, just like non-unified VRAM.

bcrosby951y ago

How far back are you digging for calling the Internet a useless toy, the 70s? Profitable companies that provided internet like services existed at least as far back as the 80s.

I've seen this line pulled out before and it always seems like an assumption than actual reality.

JonChesterfield1y ago

CPU and GPU cores have been on the same package for approximately forever. I'd really like on package memory at this point, nice though upgradable memory is the performance gap it induces would be good to see gone. AMD clearly knows how to build that and Apple has thrown down the gauntlet, I just don't know when it'll happen.

Anyone know if the 'strix' apu thing is expected to be a ddr-on-package or still using with separate sticks? Search engine is not going well for me.

eviks1y ago

> Sure, a lot of people see LLMs as "useless toys" or "overhyped" now, but people said that about the Internet too.

Sure, you can always find naysayers about any tech, but we've also seen plenty of useless toys, so that internet fact doesn't help your argument that AI will come to the edge in any way (and no, email was not a fad even at dial-up speeds, so you don't even have the internet fact)

alberth1y ago

> TSMC N4

Question: Am I understanding this correctly that AMD will be using a node size from TSMC that’s 2-years old, but in a way it’s kind of older.

Because N4 was like a “N5+” (and the current gen is “N3+”).

EDIT: why the downvotes for a question?

Night_Thastus1y ago

Yes, AMD is not using the leading-edge node. The older node is cheaper/has better yield and more importantly has much larger supply than N3, which Apple has likely completely devoured.

I am personally very curious how it compares vs Intel's 15th gen, which is rumored to be on Intel 20 process.

aurareturn1y ago

15% IPC boost is mildly disappointing.

It will be significantly slower in ST than M4, and even more so against the M4 Pro/Max.

KeplerBoy1y ago

I wouldn't be sure about that. Isn't apple silicon performance mostly benchmarked using Geekbench?

AMD claims +35% IPC improvements in that specific benchmark, due to improvement in the AVX512 pipeline.

aurareturn1y ago

AMD claims +35% IPC improvement in AES subtest of Geekbench 6. It's not the entire Geekbench 6 CPU suite. It's deceptive.

Overall GB6 improvement is likely around 10-15% only because that's how much IPC improved while clock speed remains the same.

hajile1y ago

Apple themselves have around 10% IPC uplift with M4 too.

The real issue is that most code people run doesn't use very much SIMD and even less uses AVX-512.

muxator1y ago

How come is a 15% IPC increase generation for generation a disappointing result? There might be greener pastures, I agree, but a 15% increase year over year for the quality factor of a product is nothing to be disappointed of. It's good execution, even more so in a mature and competitive sector such as microelectronics.

aurareturn1y ago

It's not year over year. It's 2 years.

It's disappointing because M4 is significantly ahead. I would expect Zen to make a bigger leap to catch up.

Also, this small leap opens up for Intel's Arrow Lake to take the lead.

adrian_b1y ago

Actually from Zen 4 to Zen 5 there is a little more than one year and a half (supposing that they will indeed go on sale in July), almost matching the AMD announcement done around the launch of Zen 4 that they will shorten the time between generations from 2 years to 1 1/2 years.

Hikikomori1y ago

Process node advantage is real.

1 more reply

JonChesterfield1y ago

I wonder how much of the performance delta there is the OS. The performance benchmarks for Zen won't be running on OSX, credible risk they're running through the overhead of Windows.

papichulo20231y ago

Meh, I guess still 128 bits memory channel width. Guess they are not even trying anymore.

icf801y ago

2 x 2 x 32bit

anything else will require newer SOCKET, MB AND RAM

lmz1y ago

Is this even possible to change without moving to a new socket?

KingOfCoders1y ago

No.

j / k navigate · click thread line to collapse

292 comments

buildbot1y ago

adrian_b1y ago

I expect that this will remain true for Zen 5 and the next Intel CPUs.

Aardwolf1y ago

> There is no difference between Intel and AMD Zen 4 for most 512-bit AVX-512 instructions.

Except for the fact that Intel hasn't had any AVX-512 for years already in consumer CPUs, so there's nothing to compare against really in this target market

adrian_b1y ago

As you say, Intel has abandoned the use of the full AVX-512 instruction set in their laptop/desktop products and in some of their server products.

At the end of 2025, Intel is expected to introduce laptop/desktop CPUs that will implement a 256-bit subset of the AVX-512 instruction set.

1 more reply

tempnow9871y ago

shrubble1y ago

My view is that Intel is trying to do market segmentation. Want AVX512? You need to buy the "pro" lineup of CPUs...

4 more replies

DEADMINCE1y ago

> The only important differences in throughput between Intel and AMD

Not exactly related, but AMD also has a much better track record when it comes to speculative execution attacks.

xattt1y ago

I see the discussion of instruction fusion for AVX512 in Intel chips. Can someone explain the clock speed drop?

adrian_b1y ago

A clock speed drop must always occur whenever a CPU does more work per clock cycle, e.g. when more CPU cores are active or when wider SIMD instructions are used.

1 more reply

camel-cdr1y ago

dzaima1y ago

More importantly for the "2 cycles" question, Zen 4 can get one cycle latency for double-pumped 512-bit ops (for the ops where that's reasonable, i.e. basic integer/bitwise arith).

camel-cdr1y ago

> Having all 512-bit pipes would still be a massive throughput improvement over Zen 4 (as long as pipe count is less than halved)

What would be different between doubling tbe pipe width vs number of pipes? (excluding inter lane operations that already had their own 512-bit pipe in Zen4)

1 more reply

wren69911y ago

SomeoneFromCA1y ago

I wish it had AVX512 Fp16.

api1y ago

At what point do these become competitive with GPUs for AI cost wise if GPUs retain their nutty price premium?

bloaf1y ago

A 50% speed boost would probably make the CPU option a lot more viable for home chatbot, just due to how easy it is to make a system with 128gb RAM vs 128gb VRAM.

I personally am going to experiment with the 48gb modules in the not too distant future.

phkahler1y ago

You could put an 8700G in the same socket. The CPU isn't much faster but it has the new NPU for AI. I'm thinking about this upgrade to my 2400G but might want to wait for the new socket and DDR5.

1 more reply

cschneid1y ago

I looked at upgrading my existing AMD based system's ram for this purpose, but found out my mobo/cpu only supports 128gb of ram. Lots, but not as much as I had hoped I could shove in there.

hajile1y ago

Does Zen5 do FP math in a single cycle?

dzaima1y ago

The thing discussed is that Zen 4 does 512-bit SIMD ops via splitting them into two 256-bit ones, whereas Zen 5 supposedly will have hardware doing all 512 bits at a time.

adrian_b1y ago

Even if Lisa Su said this at the Zen 4 launch, it is not likely that 512-bit operations are split into a pair of 256-bit operantions that are executed sequentially in the same 256-bit execution unit.

Both Zen 3 and Zen 4 have four 256-bit execution units.

For such a clock frequency, the latency of floating-point execution units of acceptable complexity is between 3 and 5, while the latency of loads from the L1 cache memory is about the same.

1 more reply

dhx1y ago

How are the 24x PCIe 5.0 lanes (~90GB/s) of the 9950X allocated?

The article makes it appear as:

* 16x PCIe 5.0 lanes for "graphics use" connected directly to the 9950X (~63GB/s).

* 1x PCIe 5.0 lane for an M.2 port connected directly to the 9950X (~4GB/s). Motherboard manufacturers seemingly could repurpose "graphics use" PCIe 5.0 lanes for additional M.2 ports.

* 7x PCIe 5.0 lanes connected to the X870E chipset (~28GB/s). Used as follows:

  * 4x USB 4.0 ports connected to the X870E chipset (~8GB/s).

  * 4x PCIe 4.0 ports connected to the X870E chipset (~8GB/s).

  * 4x PCIe 3.0 ports connected to the X870E chipset (~4GB/s).

  * 8x SATA 3.0 ports connected to the X870E chipset (some >~2.4GB/s part of ~8GB/s shared with WiFi 7).

  * WiFi 7 connected to the X870E chipset (some >~1GB/s part of ~8GB/s shared with 8x SATA 3.0 ports).

wtallis1y ago

Nobody allocates a single PCIe lane to an SSD slot, and the link between the CPU and chipset must have a lane width that is a power of two; a seven-lane link is not possible with standard PCIe.

dhx1y ago

Block diagram for AM5 (X670E/X670): https://www.techpowerup.com/review/amd-ryzen-9-7950x/images/...

Block diagram for AM4 (X570): https://www.reddit.com/r/Amd/comments/bus60i/amd_x570_detail...

adrian_b1y ago

Cheap small computers with Intel Alder Lake N CPUs, like Intel N100, allocate frequently a single PCIe lane for each SSD slot.

However you are right that such a choice is very unlikely for computers using AMD CPUs or Intel Core CPUs.

adgjlsfhk11y ago

with pcie 5, 2x is almost certainly enough for all but the absolute fastest drives.

lmz1y ago

It's usually x16/x4/x4 for GPU/M.2/Chipset. You can check the diagrams from the current x670 boards for info.

wtallis1y ago

lmz1y ago

Oh you're right. I didn't read the diagram correctly and the specs say 28 total lanes, 24 usable - probably because 4 go to the chipset.

paulmd1y ago

https://www.anandtech.com/show/20057/amd-releases-epyc-8004-...

https://www.asrockrack.com/general/productdetail.asp?Model=E...

https://www.asrockrack.com/general/productdetail.asp?Model=G...

ComputerGuru1y ago

Very well said. It feels like we have all this amazing tech that could open the door to so many creative possibilities but no one is interested in exploring it.

Before, we had so little but it was all available to utilize to the fullest extent. Now we live in a world of excess but it’s almost a walled garden.

vardump1y ago

I think that decision is ultimately made by the mainboard vendor.

irusensei1y ago

Should had go with a proper PC. Not doing this mistake anymore.

isoos1y ago

It sounds like you need a desktop workstation with replaceable extension cards, and not a mostly immutable laptop, which has different strengths.

irusensei1y ago

Agreed but it will need to wait for now.

worthless-trash1y ago

You will find that this is the cost of any laptop, any time you buy it there is always new tech around the corner and there isn't much you can do about it.

simcop23871y ago

(disclosure I own a 13in one)

zamadatix1y ago

kmfrk1y ago

Waiting for CAMM2 to get wider adoption could be interesting:

https://x.com/msigaming/status/1793628162334621754

Hopefully won't be too long now.

Delmololo1y ago

Either it was a shitty investment from the beginning or you actually use it very regularly and it would be worth it anyway to slowly thinking about something new.

gautamcgoel1y ago

mrweasel1y ago

> AMD isn't using TSMC's latest node

arvinsim1y ago

TBH, CPUs nowadays are mostly good enough for the consumer, even at mid or low tiers.

It's the GPUs that are just getting increasing inaccessible, price wise.

michaelt1y ago

Yes - with more and more users moving to laptops and wanting a longer battery life, raw peak performance hasn't moved much in a decade.

A decade ago, Steam's hardware survey said 8GB was the most popular amount of RAM [1] and today, the latest $1600 Macbook Pro comes with.... 8GB of RAM.

In some ways that's been a good thing - it used to be that software got more and more featureful/bloated and you needed a new computer every 3-5 years just to keep up.

[1] https://web.archive.org/web/20140228170316/http://store.stea...

vitus1y ago

> raw peak performance hasn't moved much in a decade.

In general, CPU clock speeds stagnated about 20 years ago because we hit a power wall.

In 1985, the state of the art was maybe 15-20MHz; in 1995, that was 300-500MHz; in 2005, we hit about 3GHz and we've made incremental progress from there.

Most of the advancements over the past 20 years have come from pipelining, increased parallelism, and changes further up the memory hierarchy.

> today, the latest $1600 Macbook Pro comes with.... 8GB of RAM.

1 more reply

antisthenes1y ago

> it used to be that software got more and more featureful/bloated and you needed a new computer every 3-5 years just to keep up.

1 more reply

CivBase1y ago

[1] https://store.steampowered.com/hwsurvey/Steam-Hardware-Softw...

2 more replies

faeriechangling1y ago

When the industry moves to lpddr6/ddr6 I wouldn’t be shocked to see an increase to 6gb per module standard although maybe some binned 4gb modules will still be sold.

MenhirMike1y ago

I guess you _can_ game on those 2 CU GPUs, but it really doesn't seem to be intended for that.

spixy1y ago

Better efficiency with three external 4K 120Hz monitors?

MenhirMike1y ago

So I was curious if there was anything else that RDNA 3/3.5 would offer over RDNA 2 in such a low end configuration.

1 more reply

cschneid1y ago

philsnow1y ago

So yeah next time I build a machine I'll appreciate having this built in.

1 more reply

AlfeG1y ago

> DDR5 speeds up to 56000MHz (yes, I know that you can use faster RAM)

Not sure that I actually CAN. 56 GHz is already a lot.

gautamcgoel1y ago

Fixed, thanks :)

orphea1y ago

5600 are MT/s (megatransfers/second), not MHz ;)

re-thc1y ago

> The CPUs are also using the previous-gen graphics architecture, RDNA2

Faster GPU is reserved for APUs. These graphics are just here for basic support.

diffeomorphism1y ago

Nah, you can get RDNA3.5 if you want to (not sure why you want that in a (home)server though)

https://www.anandtech.com/show/21419/amd-announces-the-ryzen...

Arrath1y ago

Well perhaps I will stop holding out and just get the 7800x3d, if the 9000 generation won't be too terribly groundbreaking.

doikor1y ago

> The CPUs are also using the previous-gen graphics architecture, RDNA2.

matharmin1y ago

These are decent GPUs for anything other than heavy gaming. I'm driving two 4k screens with it, and even for some light gaming (such as factorio) it's completely fine.

123yawaworht4561y ago

1 more reply

user_78321y ago

2 more replies

moooo991y ago

nickjj1y ago

Another advantage of having an integrated GPU is you can do a GPU pass-through and let a VM directly and fully use your dedicated GPU.

This could be a thing if you're running native Linux but some games only work on Windows which you run in a VM instead of dual booting.

ffsm81y ago

I have to disagree. They work great for video playback and office work. So media server, and workstations are fine without a dedicated gpu

ubercore1y ago

> The GPU on these parts is there mostly for being able to boot into BIOS or OS for debugging.

That's wildly not true. Transcoding, gaming, multiple displays, etc. They are often used as any other GPU would be used.

TacticalCoder1y ago

> The GPU on these parts is there mostly for being able to boot into BIOS or OS for debugging.

Not at all. I drive a 38" monitor with the iGPU of the 7700X. If you don't game and don't run local AI models it's totally fine.

And... No additional GPU fans.

My 7700X build is so quiet it's nearly silent. I can barely hear it's Noctua NH-12S cooler/fan ramping up when under full load and that's how it should be.

JonChesterfield1y ago

They also mean you can drive monitors using the builtin GPU while using dedicated ones for compute.

btgeekboy1y ago

aurareturn1y ago

It's because x86 chips are no longer leading in the client. ARM chips are. Specifically, Apple chips. Though Qualcomm has huge potential leapfrog AMD/Intel chips in a few generations too.

KingOfCoders1y ago

[If you're a laptop user, scroll down the thread for laptop Rust compile times, M3 Pro looks great]

You're misguided.

Apple has excellent Notebook CPUs. Apple has great IPC. But AMD and Intel have easily faster CPUs.

https://opendata.blender.org/benchmarks/query/?compute_type=...

Blender Benchmark

      AMD Ryzen 9 7950X (16 core)         560.8
      Apple M2 Ultra (24 cores)           501.82
      Apple M3 Max (12 cores)             408.27
      Apple M3 Pro                        226.46
      Apple M3                            160.58

It depends on what you're doing.

I'm a software developer using a compiler that 100%s all cores. I like fast multicore.

      Apple Mac Pro, 64gb, M2 Ultra, $7000
      Apple Mac mini, 32gb, M2 Pro, 2TB SSD, $2600

[Edit2] Compare to: 7950x is $500 and a very fast SSD is $400, fast 64gb is $200, very good board is $400 so I get a very fast dev machine for ~$1700 (0,329 p/$ vs. mini 0,077 p/$)

[Edit] Made a c&p mistake, the mini has no ultra.

aurareturn1y ago

That seems wrong.

Though Blender may have an optimization for avx512 but not for SME or Neon.

But the vast majority will use GPUs to do rendering for Blender.

Try SPEC or its close consumer counterpart, Geekbench.

As an anecdote, all my Python and Node.js applications run faster on Apple Silicon than Zen4. Even my multithread Go apps seem to run better on Apple Silicon.

4 more replies

hajile1y ago

There's something wrong with your M3 Max stuff. I believe it comes in 14 and 16-core variants while the M3 Pro comes in 11 and 12-core variants.

In any case, M3 Max uses less than 55w of power in CPU-only workloads while a desktop 7950x peaked out at 332w of power according to Guru3D (without an OC).

The fact that M2 Ultra hits so close while peaking out at only around 100w of CPU power is pretty crazy (M2 Ultra doesn't even hit 300w with all CPU and GPU cores maxed out).

2 more replies

wtallis1y ago

> Blender Benchmark

> Apple Mac mini, 32gb, M2 Ultra, 2TB SSD, $2600

Not a real thing. You meant M2 Pro, because the Max and Ultra chips aren't available in the Mac mini.

2 more replies

krasin1y ago

Until ARM has a proper UEFI support, it's not a practical desktop/server with a few notable exceptions (Mac, Raspberry Pi) and only because there's so much support from the respective vendors.

walterbell1y ago

bitwize1y ago

It's like I keep saying: the first Chinese manufacturer to churn out cheap SBCs with ServerReady support will make a killing as a true Pi killer. Anyone? Anyone? Pine64? Pine64?

1 more reply

craftkiller1y ago

ARM needs more than just proper UEFI support: Microsoft needs to lift the secureboot restrictions on ARM.

x86: Microsoft requires that end-users are allowed to disable secure boot and control which keys are used.

arm: Microsoft requires that end-users are not allowed to disable secure boot

source: https://mjg59.dreamwidth.org/23817.html

mmaniac1y ago

The mobile APUs are way more interesting.

TacticalCoder1y ago

> 15% overall is good but nothing earth shattering

So it looks like it's 15% overall more power with less power consumption.

hajile1y ago

ComputerGuru1y ago

I know AMD has joined the “fudging the TDP numbers” game but you should still be able to more or less compare TDP across their own products.

2 more replies

vegabook1y ago

mmaniac1y ago

Strix Halo appears to be AMD's competitor to Apple SoCs which will feature a much bigger iGP and much greater memory bandwidth. When we hear more about that, comparisons will be apt.

aurareturn1y ago

Ryzen doesn't even lead in MT on laptops.

With M4, they're likely to fall even farther behind. M4 Pro/Max is likely to arrive in Fall. AMD's Strix Point doesn't seem to have a release date.

1 more reply

vegabook1y ago

They could have done what AAPL did ages ago but they have no ability to innovate properly. They've been leaning on their x86 duopoly and if it's now on its last legs, it's their fault.

2 more replies

KingOfCoders1y ago

(Not a gamer)

For me as a developer Geekbench Clang benchmarks:

    M2 Ultra   233.9 Klines/sec
    7950x      230.3 Klines/sec  
    14900K     215.3 Klines/sec  
    M3 Max     196.5 Klines/sec

mmaniac1y ago

Even including M2 Ultra in that comparison is a bit unfair. That's a HEDT workstation CPU. The competition for this is a Threadripper.

1 more reply

andy_ppp1y ago

M4 Ultra is going to be interesting, it could be approaching 300?

re-thc1y ago

> which is still 400 points or so behind apple silicon for a much higher clock speed and a multiplier larger power draw

Not a fair comparison. If we're on about Geekbench as per the announcement, it's +35%. The 15% is a geomean. It might not be better but definitely not far off Apple.

In a similar manner, except Geekbench the geomean of M3 vs M4 isn't that great either.

fh93021y ago

+35% is for a single cherry-picked Geekbench AES subtest. AMD did not show the overall Geekbench improvement.

1 more reply

ArtTimeInvestor1y ago

Do all CPUs and GPUs these days involve components made by TSMC?

If so, is this unique - that a whole industry of relies on one company?

microtonal1y ago

Arguably that single company is ASML. There are more fabs (e.g. Intel), but AFAIK cutting-edge nodes all use ASML EUV chip fabrication machines?

Cu3PO421y ago

Intel still fabs their own CPUs, their dedicated Xe graphics are made by TSMC, though.

Nvidia 30-series was fabbed by Samsung.

So there is some competition in the high-end space, but not much. All of these companies rely on buying lithography machines from ASML, though.

Wytwwww1y ago

>Intel still fabs their own CPUs

Isn't Lunar Lake made by TSMC? Supposedly they have comparable efficiency to AMD/Apple/Qualcomm at the cost of making their fab business even less profitable

moooo991y ago

AFAIK there are some components that are made by TSMC while some others are still made at Intels own fabs

1 more reply

unwind1y ago

As far as I know, Intel is still very much a fab company.

pjc501y ago

This is probably a lot more common than you might think. How much of an "entire industry", or indeed industry as a whole, relies on Microsoft?

icf801y ago

wait till china invades

brokencode1y ago

Good thing TSMC is getting billions of dollars of government subsidies to build fabs all around the world including in the US and Japan.

preisschild1y ago

As long as Biden wins the election, they won't.

Because the US will defend Taiwan.

BeefWellington1y ago

The real problem is, any disruption of Taiwan and China's ability to produce and sell semiconductors right now is going to have wide ranging impacts on the economy.

If you thought prices were high during pandemic shortages, strap in.

kfrzcode1y ago

Your crystal ball is intriguing, do you have a whitepaper I can check out to understand the underlying mechanics?

gattr1y ago

Night_Thastus1y ago

tracker11y ago

Might be worth considering a Ryzen 9 5900XT (just launched as well) for a drop in upgrade. Been running a 5950X since close to launch and still pretty happy with it.

mananaysiempre1y ago

Would it really be smart to build an AM4 desktop at this point though?

tracker11y ago

I said as a "drop in upgrade"... Not sure that I'd build AM4 today, unless I had some parts used and cheap/free.

ComputerGuru1y ago

Interesting unveil; it seems everyone was expecting more significant architectural changes though they still managed to net a decent IPC improvement.

tripdout1y ago

7900X equivalent 9900X has 120W TDP as opposed to 170W. Is it that much more power efficient?

Voultapher1y ago

That 170W TDP was chasing the benchmark crown. The 7950X lost <5% MT perf when run at 120W.

canucker20161y ago

The all-core max clock speed costs a lot of watts for that last 10% - close to 50% of total CPU consumption in some cases.

That's why undervolting has become a thing to do (unless you're an Intel CPU marketer) - give up a few percent of your all-core max clock rate and cut your wattage used by a lot.

GordonS1y ago

I wonder then if the 9900X could be undervolted without losing more than a few percentage points of speed?

sylware1y ago

Improving avx512 is really good, since this is the data unit sweet spot: a cache line is 512 bits.

Sparkyte1y ago

Huge improvement over my 5k series processor.

Pet_Ant1y ago

[1] https://www.youtube.com/watch?v=GSHasXHvZaQ

adriancr1y ago

Finally, was waiting for this for a new build and it seems decent upgrade.

nubinetwork1y ago

I'd love to replace my 2950x, but most desktop motherboards still don't support 128gb of memory.

mkl1y ago

No, a lot support 128GB. Some support 192GB or even 256GB: https://skinflint.co.uk/?cat=mbam5&xf=317_X670E&asuch=&bpmin...

lldb1y ago

IamFr0ssT1y ago

4 DIMMs does not equal Quad-channel. I see in AMD's presentation Quad-channel is supported by chipsets, but I am not aware of a current AMD consumer/hedt chip with 4 memory channels.

MenhirMike1y ago

Non-Pro Threadripper 7960X/7970X/7980X are Quad Channel. (The Threadripper Pro's are Octa-Channel)

But yeah, Desktop Ryzen's are all Dual Channel.

1 more reply

erinnh1y ago

I just had a look and the comparison website I use says there are 113 motherboards that support 128GB or more memory for AM5. (in my local market. so the US probably has even more)

128GB isn't exactly a lot, so that would surprise me if it wasnt supported.

nubinetwork1y ago

I think for a while there, the only way to be able to use 128gb was to go TR4 or TRX... I kindof stopped looking for a while, but 100+ boards is certainly a nice change.

DEADMINCE1y ago

> I just had a look and the comparison website I use

Which website is that?

erinnh1y ago

geizhals.de

Very good website to compare gadgets/electronics and look for good prices.

Its for the German market though (and some small amount of Austrian shops)

1 more reply

SirGiggles1y ago

I can’t speak for all motherboards but I’m comfortably running 128GiB of DDR4 on a 5950x and x570s platform.

nubinetwork1y ago

Good to know, maybe I should look again... thanks

chainingsolid1y ago

yourusername1y ago

For AM5 most boards with 4 DIMM slots will support 128 GB, some support 192 GB.

Tepix1y ago

I had 128GB DDR4 3200 running stable on X570 with Ryzen 3000. You should be able to run 192GB these days or even 256GB (but expensive).

JonChesterfield1y ago

resource_waste1y ago

Kind of a shame, given how cheap 512gb is.

With LLMs, I feel like the line between consumer and professional is getting blurred.

Not that CPU is really reasonable for LLMs that big...

Tepix1y ago

It's as if our planet wasn't being destroyed at a frightening speed. We're headed towards a cliff, but instead if braking, we're accelerating.

mmaniac1y ago

The people who think PCs are already fast enough don't buy CPUs every year.

A 7950X in Eco mode is ridiculously capable for the power it pulls but that's less of a selling point.

hajile1y ago

I think there's a market for a 7950E with a specific binning for lower power rather than highest possible clocks.

mmaniac1y ago

Compared to Intel, AMD seems to bin their chips a lot less. Intel have their -T chips which would be nifty if Intel weren't so far behind in terms in terms of efficiency.

adham-omran1y ago

I have a 7950X which I run in Eco mode with air cooling and it's amazingly quiet when not under load and rips when it needs to, I don't see a reason to upgrade any time soon.

dotnet001y ago

jiggawatts1y ago

Sooner or later, AI will need to run on the edge, and that'll require RAM bandwidths measured in multiple terabytes per second, as well as "tensor" compute integrated closely with CPUs.

"No, no, no, the dragon is too powerful! ... I don't care if your sword is a unique artefact, your arm is weak!"

pquki41y ago

"AI will need to run..."

Let's wait and see what actually happens to AI before being too eager to change the design of computers. I'm also pretty sure there will be a better solution than what you described.

Semaphor1y ago

dotnet001y ago

These aren't laptop parts though. Maybe they'll do integrated memory for laptops, since many solder that on anyway nowadays.

But, for the desktop parts I don't see that being worthwhile unless it's used as something like with the 'X3D' chips, a large cache layer to the expandable memory.

I feel like this is nicely indicated by how AMD's desktop APUs get a lot less interest, they were fine without it for several generations, and even now it's just an afterthought.

smolder1y ago

ericd1y ago

Yeah, bitnets already shift those matmuls to integer math.

smolder1y ago

Thanks for turning me on to this subject. It's interesting. I took a look at this paper: https://arxiv.org/abs/2310.11453

1 more reply

aaomidi1y ago

I can’t wait for my games to start hallucinating narration.

fourfour31y ago

Exactly this.

I want plot/character driven games like RPGs to be a curated & carefully designed, plotted, and paced experience.

I don't want AI generated stilted dialogue, uncanny valley speech, etc.

Bringing in more compute resource for running speech to text, LLMs, text to speech, etc fast enough to not feel horrible is going to come at substantial power and financial cost.

nottorp1y ago

But... that would be ... "making video games". The large "video game companies" would rather "create content" and use the cheapest possible option.

mavhc1y ago

1 more reply

WithinReason1y ago

This but unironically

treprinum1y ago

Isn't CAMM2 providing higher bandwidth? MOBOs with CAMM2 for AMD are already popping up.

cubefox1y ago

Problem with unified memory is that software needs to support it, so there is a chicken-egg situation. Another issue is that unified memory is usually not replaceable, just like non-unified VRAM.

bcrosby951y ago

How far back are you digging for calling the Internet a useless toy, the 70s? Profitable companies that provided internet like services existed at least as far back as the 80s.

I've seen this line pulled out before and it always seems like an assumption than actual reality.

JonChesterfield1y ago

Anyone know if the 'strix' apu thing is expected to be a ddr-on-package or still using with separate sticks? Search engine is not going well for me.

eviks1y ago

> Sure, a lot of people see LLMs as "useless toys" or "overhyped" now, but people said that about the Internet too.

alberth1y ago

> TSMC N4

Question: Am I understanding this correctly that AMD will be using a node size from TSMC that’s 2-years old, but in a way it’s kind of older.

Because N4 was like a “N5+” (and the current gen is “N3+”).

EDIT: why the downvotes for a question?

Night_Thastus1y ago

Yes, AMD is not using the leading-edge node. The older node is cheaper/has better yield and more importantly has much larger supply than N3, which Apple has likely completely devoured.

I am personally very curious how it compares vs Intel's 15th gen, which is rumored to be on Intel 20 process.

aurareturn1y ago

15% IPC boost is mildly disappointing.

It will be significantly slower in ST than M4, and even more so against the M4 Pro/Max.

KeplerBoy1y ago

I wouldn't be sure about that. Isn't apple silicon performance mostly benchmarked using Geekbench?

AMD claims +35% IPC improvements in that specific benchmark, due to improvement in the AVX512 pipeline.

aurareturn1y ago

AMD claims +35% IPC improvement in AES subtest of Geekbench 6. It's not the entire Geekbench 6 CPU suite. It's deceptive.

Overall GB6 improvement is likely around 10-15% only because that's how much IPC improved while clock speed remains the same.

hajile1y ago

Apple themselves have around 10% IPC uplift with M4 too.

The real issue is that most code people run doesn't use very much SIMD and even less uses AVX-512.

muxator1y ago

aurareturn1y ago

It's not year over year. It's 2 years.

It's disappointing because M4 is significantly ahead. I would expect Zen to make a bigger leap to catch up.

Also, this small leap opens up for Intel's Arrow Lake to take the lead.

adrian_b1y ago

Hikikomori1y ago

Process node advantage is real.

1 more reply

JonChesterfield1y ago

I wonder how much of the performance delta there is the OS. The performance benchmarks for Zen won't be running on OSX, credible risk they're running through the overhead of Windows.

papichulo20231y ago

Meh, I guess still 128 bits memory channel width. Guess they are not even trying anymore.

icf801y ago

2 x 2 x 32bit

anything else will require newer SOCKET, MB AND RAM

lmz1y ago

Is this even possible to change without moving to a new socket?

KingOfCoders1y ago

No.

j / k navigate · click thread line to collapse