
0 points · q7xvh97o2pDhNrh · 3y ago · 0 comments

Do you happen to know where Apple's integrated approach falls on this spectrum?

I was actually wondering about this the other day. A fully maxed out Mac Studio is about $6K, and it comes with a "64-core GPU" and "128GB integrated memory" (whatever any of that means). Would that be enough to run a decent Llama?


2 comments · 2 top-level

smoldesu · 3y ago

It's certainly enough to run a decent Llama, but it's hardly the most cost-effective option. Apple's approach falls between the low-bandwidth Intel/AMD laptops and the high-bandwidth PCIe HPC components. In a way it's trapped between two markets - ultra-cheap Android/Windows hardware with 4-8 GB of RAM that can still do AI inferencing, and ultra-expensive GPGPU setups that are designed to melt these workloads.

The genial thing to say is that it performs very favorably against other consumer inferencing hardware. The numbers get ugly fast once you start throwing money at the problem, though.

cudder · 3y ago

The Mac's "integrated memory" means it's shared between the CPU and GPU. So the GPU can address all of that and you can load giant (by current consumer GPU standards) models. I have no idea how it actually performs though.
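
For a rough sense of what "giant" means against 128GB, here is a back-of-the-envelope sketch: weight memory is roughly parameter count times bytes per parameter. The parameter counts are the nominal LLaMA sizes; the `headroom` fraction (how much of the shared pool the model may take) is a made-up illustrative value, and the estimate ignores the KV cache, activations, and runtime overhead, so treat it as a lower bound and not a measurement.

```python
# Weight-only memory estimate: parameters x bytes per parameter.
# Ignores KV cache, activations and runtime overhead (lower bound only).
GB = 1e9  # decimal gigabytes, matching how the 128GB figure is quoted

# Nominal LLaMA parameter counts (rounded marketing sizes).
MODELS = {"7B": 7e9, "13B": 13e9, "33B": 33e9, "65B": 65e9}

# Bytes per parameter for common precisions.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "4-bit": 0.5}


def weight_gb(params, bytes_per_param):
    """Approximate size of the weights alone, in GB."""
    return params * bytes_per_param / GB


def fits(params, bytes_per_param, budget_gb=128.0, headroom=0.75):
    """True if the weights fit in an assumed usable share of memory.

    headroom is a hypothetical fraction left for the model after the OS
    and other processes take their cut; it is not an Apple-documented limit.
    """
    return weight_gb(params, bytes_per_param) <= budget_gb * headroom


for name, params in MODELS.items():
    for precision, nbytes in BYTES_PER_PARAM.items():
        size = weight_gb(params, nbytes)
        verdict = "fits" if fits(params, nbytes) else "too big"
        print(f"{name:>3} {precision:>5}: {size:6.1f} GB  {verdict}")
```

Under those assumptions the only combination that doesn't fit is 65B at fp16. None of this says anything about speed, which is a separate question.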
