It has 10 big Cortex-X925 cores, which are competitive with the Intel P-cores and with the AMD Zen cores, plus 10 small Cortex-A725 cores, which are similar in performance with the older Intel E-cores, from the Meteor Lake, Raptor Lake and Alder Lake generations. The current Intel E-cores are similar to Cortex-X4, i.e. they are much faster.
This Arm based CPU is more powerful that any Arm-based CPU previously used in a non-Apple PC, but in multi-threaded applications it is inferior to AMD Strix Halo CPUs.
The GPU of this is different from that of DGX, which was good only at ML/AI, but poor for graphics.
Here the GPU is likely to be good for graphics, and the top model will have up to 6144 FP32 execution units compared to 2560 of Strix Halo. But I assume that at least the top models will also be much more expensive than Strix Halo.
This NVIDIA CPU+GPU is limited to 128 GB of DRAM, while the successor of Strix Halo, which has been announced recently, offers up to 192 GB of DRAM, so NVIDIA continues its tradition of always providing less memory than its competitors, in order to have better profit margins.
“The RTX Spark superchip features an NVIDIA Blackwell RTX GPU with 6,144 CUDA cores and fifth-generation Tensor Cores with FP4 precision, connected via the NVIDIA NVLink®-C2C chip-to-chip interconnect to a high-performance, 20-core NVIDIA Grace™ CPU.
MediaTek, a market leader in Arm-based system-on-a-chip designs, collaborated with NVIDIA on the custom CPU design, contributing to its best-in-class power efficiency, performance and connectivity.“
https://nvidianews.nvidia.com/news/nvidia-microsoft-windows-...
Nvidia Grace is an ARM core.
And Mediatek? Oof. I assume the SOC comes pre compromised out of the box.
It has 6,144 CUDA cores is similar to a RTX 4070 (5,888) but a lot less than a 4090 (16,384), but what it does have is support for FP4.
When they claim "1 Petaflop AI compute", thats what they mean. For comparison, a RTX 4090 has ~1.3 Petaflops of FP8 processing.
The second big deal is the NVLink-C2C interconnect, which provides up to 900 GB/s of bidirectional bandwidth between GPU and CPU. For comparison, the Apple M4 has 120 GB/s and the M3 Ultra has 819 GB/s. Notably, the Apple M series does not have FP4 support, so this could mean a significant performance improvement over Apple's offerings.
Considering how much Valve invested into ARM emulation, it's quite possible the next Steam Deck/handheld will use a variation of this (or at least there will be one using this as the SoC).
"A powerful new chapter"
"A major leap in graphics, performance, and usability"
"The most secure version of Windows ever"
"The most productive Windows ever"
"The Most Powerful Operating System Ever"
which is terrifying
“Top agentic and AI developer workloads like GitHub Copilot, Claude Code, ComfyUI, Cursor and more now run across all modern PC silicon – making Windows the ideal platform for AI-assisted development”
None of these tools make use of local GPU whatsoever.
You can use local models with most of those, you’d just need to get something like Ollama, LM Studio or vLLM working - but that’s the boring and not flashy part, it’s probably easier to name the dev tools.
Copilot: https://docs.github.com/en/copilot/how-tos/copilot-cli/custo... (also worked in Visual Studio Code)
Claude Code: https://code.claude.com/docs/en/model-config (here’s an example of a custom config in the wild with DeepSeek, same principle for local connection https://api-docs.deepseek.com/guides/agent_integrations/clau...)
Cursor: couldn’t find official docs but here’s z.AI docs https://docs.z.ai/devpack/tool/cursor (same principle for local API)
Nitpick, but I remember running it specifically because there was a way to run LLMs on local hardware.
Thing is, hardly any games are optimized for ARM. And no serious AI development occurs on Windows.
I understand that you want multiple models running concurrently, but then 128GB is starting to look cramped.
It's a bold move that goes one to one against Apple and AMD Strix Halo.
I'm looking at it and thinking, if it can run Linux at a fair price it could be great.
I'm curious. I am thinking, what does a non developer buy this thing, take it home, and do with it.
What does the unboxing and first 24 hours look like?
A mixture of emotions somewhere between thinking you are living in the future, and frustration at not actually being able to do much.
The target audience seems like wealthy early adopters, but that is about it.
I guess we shall see.
Indeed. This NVIDIA CPU will have a lifetime overlapping with the successor of Strix Halo, which has been announced recently and which increases the maximum amount of DRAM to 192 GB.
The Strix Halo CPU has better multi-threaded performance than this, so the only advantage of NVIDIA is that the top variants will have a bigger GPU than Strix Halo and its successor, but I assume that the variants with a bigger GPU will also be much more expensive.
I imagine it’s for AI researchers, professionals who work with models. Software engineers who want local models instead of cloud models.
To be honest, I don’t think it’ll be very popular with those demographics. But I think a company like Microsoft investing in local AI is a good thing.
1: https://www.theverge.com/tech/940584/microsoft-surface-lapto...