silicon taken up that could've been used for a few more compute units on the GPU, which is often faster at inference anyway and way more flexible, programmable, and well documented.
The thing is, when you take a picture on an Apple device, it's performing ML tasks while barely sipping battery.
Microsoft maybe shouldn't be chasing Apple, especially since they don't actually have any market share in tablets or phones, but I see what they're getting at: they are probably tired of their OS living on devices that get half the battery life of their main competition.
And here's the thing: Qualcomm's solution blows Intel out of the water. The only reason not to use it is that Microsoft can't pull off the kind of architecture transition that Apple does. Apple can get 100% of their users to switch architectures in about 7 years whenever they want.
When you use an Apple device, it’s performing ML tasks while barely using any battery life. That’s the whole point of the NPU. It’s not there to outperform the GPU.
The Core Ultra lineup is supposed to be low-power, low-heat, right? If you want more compute power, pick something from a different product series.
I think that "dark silicon" mentality is mostly lingering trauma from when the industry first hit a wall with the end of Dennard scaling. These days, it's quite clear that you can have a chip that's more or less fully utilized, certainly with no "dark" blocks that are as large as a NPU. You just need to have the ability to run the chip at lower clock speeds to stay within power and thermal constraints—something that was not well-developed in 2005's processors. For the kind of parallel compute that GPUs and NPUs tackle, adding more cores but running them at lower clock speeds and lower voltages usually does result in better efficiency in practice.
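To make the "more cores at lower clocks" argument concrete, here's a back-of-the-envelope sketch using the standard dynamic power model P ≈ C·V²·f. All numbers are illustrative assumptions, not measurements from any real chip; the point is just that halving frequency lets you drop voltage, and voltage enters the power equation squared.

```python
# Dynamic switching power is roughly proportional to C * V^2 * f.
# Numbers below are hypothetical, chosen only to illustrate the scaling.

def dynamic_power(cap, voltage, freq):
    """Dynamic power in arbitrary units: capacitance * voltage^2 * frequency."""
    return cap * voltage**2 * freq

# One core at full speed: 3 GHz, requiring 1.0 V.
p_one = dynamic_power(cap=1.0, voltage=1.0, freq=3.0)
work_one = 3.0  # for embarrassingly parallel work, throughput ~ total core-GHz

# Two cores at half speed: 1.5 GHz each, which (we assume) allows 0.8 V.
p_two = 2 * dynamic_power(cap=1.0, voltage=0.8, freq=1.5)
work_two = 2 * 1.5  # same total core-GHz, i.e. same throughput

print(p_one, work_one / p_one)  # 3.0 units of power, 1.0 work per unit power
print(p_two, work_two / p_two)  # 1.92 units of power, ~1.56 work per unit power
```

Same throughput, roughly two-thirds the power, at the cost of extra die area: that's the tradeoff wide, slow-clocked blocks like GPUs and NPUs are built around.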
The real answer to the GPU vs NPU question isn't that the GPU couldn't grow, but that the NPU has a drastically different architecture, making power-versus-performance tradeoffs that carve out a niche of inference tasks where the NPU is the better choice.