Key Takeaways

  • NPUs are efficient for AI workloads, serving a different purpose than GPUs.
  • NPU advantages include lower latency, specialized memory hierarchies, and reduced power consumption.
  • GPUs and NPUs can coexist, and there’s no reason AI workloads can’t run on GPUs as well.



Given that Copilot+ will run on NPUs capable of 40 TOPS and above but not on your RTX 4090-powered gaming rig, why do we need NPUs at all if a GPU is capable of so much more? While it's still utterly ridiculous that Copilot+ won't run on a 4090 but will on an NPU, NPUs do have real benefits that make them worthwhile additions to so-called "AI PCs."


NPUs are worth investing in, and they serve a different purpose from GPUs

For normal PC users, it won’t matter though



The big advantage of NPUs for AI workloads is efficiency: they require significantly fewer resources than a GPU, both to run and to fit into a computer in the first place. Given the power profile of a laptop GPU in particular, including an NPU to handle those sustained AI workloads makes sense. A lot of what a GPU is built to do, meanwhile, is of no benefit to AI at all.

NPUs are specifically geared towards matrix multiplications and convolutions, which means they can execute those calculations more efficiently while still maintaining the parallelism that makes GPUs good at AI. Their data flow is also optimized through tight integration with the CPU, placing them physically closer to the data being operated on, which means NPUs can retrieve that data with lower latency.
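To make "matrix multiplications" concrete: nearly every layer of a neural network boils down to multiply-accumulate (MAC) operations like the following. This is a plain-Python sketch for illustration only; a real NPU executes thousands of these MACs in parallel in fixed-function hardware.

```python
# A dense (fully connected) neural-network layer is just a matrix-vector
# multiply: each output is a sum of input * weight products (MACs).
# This is the core operation NPUs are built to execute in bulk.
def dense_layer(inputs, weights):
    # weights: one row of coefficients per output neuron
    return [sum(x * w for x, w in zip(inputs, row)) for row in weights]

x = [1.0, 2.0, 3.0]
W = [[0.5, 0.5, 0.5],   # neuron 1
     [1.0, 0.0, -1.0]]  # neuron 2
print(dense_layer(x, W))  # [3.0, -2.0]
```

Convolutions are the same story: slide a small weight matrix over the input and do a MAC at every position, which is why hardware tuned for one is tuned for both.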


On top of parallelism, NPUs incorporate specialized memory hierarchies. That may mean on-chip SRAM: fast, low-latency memory for storing intermediate data, weights, and activations during neural network processing. Keeping frequently accessed data on-chip means the NPU rarely needs to touch slower, off-chip DRAM, significantly reducing both latency and power consumption.
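A back-of-the-envelope sketch of why on-chip reuse matters. The numbers here are purely illustrative and don't model any real chip: a naive matrix multiply re-fetches operands from off-chip memory for every MAC, while a tiled version loads each tile into fast on-chip SRAM once and reuses it many times.

```python
# Rough model of off-chip (DRAM) element reads for an N x N matrix multiply.
# Illustrative only -- not a measurement of any real NPU.
def dram_reads_naive(n):
    # every multiply-accumulate fetches one element of A and one of B
    return 2 * n**3

def dram_reads_tiled(n, tile):
    # with tile x tile blocks held in on-chip SRAM, each tile-level
    # product loads two tiles of tile*tile elements, and there are
    # (n/tile)^3 such products
    blocks = n // tile
    return blocks**3 * 2 * tile**2

n, tile = 1024, 64
print(dram_reads_naive(n) // dram_reads_tiled(n, tile))  # 64
```

In this simplified model, off-chip traffic drops by roughly the tile size, which is exactly the latency and power win the memory hierarchy is there to deliver.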

All of this makes NPUs worth investing in, even if they're still worse at AI than a conventional high-end consumer GPU.


Still, the option should be there to run AI on your GPU

NPUs still serve a purpose, but GPUs are more than capable too



What I don't understand about the NPU craze on laptops and PCs is the insistence that these workloads have to run on an NPU. There's nothing preventing Microsoft's Recall (rest in peace, temporarily) from running on an Nvidia or AMD GPU, and the specific TOPS requirement of the NPU doesn't make sense as a hard technical limit, either.

As we already talked about, TOPS numbers are meaningless. If you look up the specs for an Nvidia GeForce RTX 4090, you’ll find that it’s capable of 1,321 “AI TOPS.” Not all TOPS are created equal, but with a number that high, it doesn’t really matter if it’s measured with the INT8 precision that Nvidia’s competitors are using. It’s clear that it smokes the competition, regardless of whether that number is INT4, INT8, or anything else.


NPUs have their place in the industry, and that place is distinct from GPUs. They can absolutely coexist, but preventing NPU workloads from being offloaded to a GPU makes little sense, both technologically and for consumers. From the outside, it looks like gated exclusivity designed to drive purchases of new AI PCs and nothing more, because there's little reason Recall and other Copilot+ features can't run on the best GPUs of today.

To sum things up, NPUs are powerful and have their place in the industry, but they're not exactly irreplaceable. A GPU will do perfectly fine, and it's power consumption, above all, that drives NPU adoption today. Maybe NPUs will someday outperform GPUs, but if your PC has a powerful GPU, it's already capable of more than any current-gen NPU can offer.

