You want to use an NVIDIA GPU for LLMs? Just buy a basic second-hand PC (the GPU is the primary cost anyway). You want a Mac for a good amount of VRAM? Buy a Mac.
With this proposed solution you have a half-baked system: the GPU is limited by the Thunderbolt port and you don't have access to all of NVIDIA's tools and libraries, while on the other hand you have a system that lacks the integration of a native solution like MLX and risks breaking in a future macOS update.
The software stack has been ready for Apple Silicon for more than half a decade.
> the hardware wasn't usable on macOS
This eGPU thing is from a third-party if I understand correctly. I don't see why nvidia would get excited about that. If they cared about the platform, they would have released something already.
If a model can run on a 512GB M3 Ultra via MLX or CUDA, while simultaneously benefiting from the memory bandwidth of something like an RTX 6000 Pro, that would save my company hundreds of thousands of dollars. That's $20,000 for roughly 600GB of VRAM, and enough token generation speed to fulfill the needs of any enterprise that's not a hyperscaler or neocloud.
I'll let someone else do the math for you on what it costs to put together a 10U server to get that kind of performance without the $10K M3 Ultra Studio.
What we're paying for five old 80GB A100s is criminal, but it's nothing compared to what these GB200 Blackwell setups are going to cost in 2030. Market economics aside, the fact that they require sophisticated liquid-cooling infrastructure and draw 3x the power of the A100s will make these cards unattainable for small to medium organizations.
So yeah, if there's some outside chance that we can pair NVIDIA's speed with an ARM-powered machine that offers 512GB of unified memory while drawing 50W -- you better believe it's a big deal. We'll see. Sounds too good to be true.
Yes, for many scenarios this is "not even an academic exercise".
For a very select few applications this is Gold. Finally serious linear algebra crunch for the taking. (Without custom GPU tapeout.)
Not everything is limited by the transfer speed to/from the GPU. LLM inference, for example.
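As a rough back-of-the-envelope sketch (all numbers here are illustrative assumptions, not measurements): once the weights are resident in the eGPU's VRAM, only small activations cross the link per generated token, so even Thunderbolt's capped PCIe bandwidth is nowhere near the bottleneck for token generation.

    # Back-of-the-envelope: per-token traffic over the link vs. link capacity.
    # All figures are illustrative assumptions, not measurements.
    hidden_dim = 8192                  # hidden size of a large-ish model
    bytes_per_value = 2                # fp16 activations
    per_token_bytes = hidden_dim * bytes_per_value    # ~16 KB per token
    link_bytes_per_s = 22e9 / 8        # ~2.75 GB/s of usable PCIe over TB3

    tokens_per_s_link_limit = link_bytes_per_s / per_token_bytes
    print(f"link saturates around {tokens_per_s_link_limit:,.0f} tokens/s")
    # Compare that with the tens of tokens/s the GPU itself will actually produce.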
I thought Thunderbolt was basically hot-pluggable PCIe? The whole point was not to limit peripherals.
he actually did reply weeks later and said "i didnt realize people wanted this, my team has added them. go check now". pretty sure that was the last time nvidia drivers came to macos.
there's a lot of assumptions made with this topic, particularly the assumption that apple is blocking them. at least in my experience the opposite was true, nvidia just flat out wasn't making them. however i don't doubt the truth lies somewhere in between: nvidia and apple have a pretty much nonexistent relationship now. i dont know whats required here but i also don't doubt apple makes this experience suck butt for any interested parties.
Apple has a monopoly over the "M-chip" personal computer market. They have a monopoly over the iOS market with the app store. They have a monopoly over the driver market on macOS.
Like, Microsoft was found guilty of exploiting its monopoly for installing IE by default while still allowing other browser engines. On iOS, apple bundles safari by default and doesn't allow other browser engines.
If we apply the same standard that found MS a monopoly in the past, then Apple is obviously a monopoly, so at the very least I think it's fair to say that reasonable people can disagree about whether Apple is a monopoly or not.
[0]: https://en.wikipedia.org/wiki/United_States_v._Microsoft_Cor....
When a company is deemed an illegal monopoly, the DoJ basically becomes part of management. Antitrust settlements focus on germane elements, e.g. spin offs. But they also frequently include random terms of political convenience.
I don’t think we want a precedent where companies having a product means they have an automatic monopoly on said product.
But the M series are an Apple product line designed by Apple with an ARM license and produced on contract by TSMC for use in other Apple products.
Don’t assume the facts from another case automatically apply in other cases.
Or as Justice Jackson once put it: “Other cases presenting different allegations and different records may lead to different conclusions”
Intel sold chips to anyone. Anyone could make Intel computers.
Apple does not sell chips to anyone. Nobody else can make m-series computers.
Your argument is basically that Ford has a monopoly on selling mustangs because standard oil had a monopoly on selling oil.
lmao what? the "M-chip" is literally their chip that they designed, built relationships with TSMC over, and bankrolled into production to put in their products. literally hardware by apple, for apple. this was a decade-plus thing in the making; this is the risk/gamble apple took and invested heavily into. that is apple's innovation. any other manufacturer is free to go do this themselves for their own devices, they just didn't and for the most part still don't.

that just isn't a monopoly at all, and i'm amused you even got to that point in the first place. it seems to carry some broad misunderstandings of what the M-series chips are, or an assumption that CPUs are supposed to be shared with any interested parties just because that was intel's business model. intel was historically slacking and their one-size-fits-most approach wasn't meeting the engineering requirements apple was after, generation after generation, so apple took its cpu destiny into its own hands and made its own.

if you feel like non-apple laptop chips aren't living up to that kind of perf/ppu... well yeah, you'd be right. but that's not really apple's fault. that's not a monopoly thing, like at all. either laptop manufacturers need to go make their own chip (unlikely) or intel/qualcomm/etc need to catch up.
If we have a right to repair (we broadly do not, AFAICT), then that doesn't necessarily mean that we have a right to modify and/or add new functionality.
When I repair a widget that has become broken, I merely return it to its previous non-broken state. I might also decide to upgrade it in some capacity as part of this repair process, but the act of repairing doesn't imply upgrades. At all.
> No OS provider should be allowed to dictate what software you can or not run on your own device and / or OS you have paid for.
I agree completely, but here we are anyway. We've been here for quite some time.
Because of that, you need an apple device around to be able to deal with iMessage users.
In my bubble literally no one uses iMessage. The more tech-savvy use Signal/GroupMe, the less tech-savvy use SMS/email. Family use Signal to chat with me, as I can steer my own family a little.
Also, I sometimes open the Facebook web interface, but any attempt to get me onto WhatsApp I answer with "sorry, no Facebook apps on my phone, no Instagram/Messenger either". I've never had any issues with that. Although I've heard some countries are very dependent on Facebook, so it might be harder there.
By the way, I've noticed it's actually not hard to use multiple messengers; sometimes it's even faster to find a contact, since you always remember which app's recents to look in.
UPDATE: My point is that you can also influence your own life and how people communicate with you. Up to a point, of course, but it's not like you can't do anything about it.
You've listed a whole bunch of alternatives available to you, but for some reason you demand that Apple change its unique offering into just another one of those for you. Why? Is that not a completely enforced monoculture?
Apple has always been off to the side, doing their own thing, and for some reason that fact utterly enrages people. They demand that Apple become just like everyone else. But we already have everyone else! And in every single field Apple is in, there is more of everyone else than there is of Apple.
Have you considered people like Apple products precisely because they're not like everything else? That making Apple indistinguishable from Facebook or Google is no victory, but a significant loss for customer choice?
I have been an Android user for the past 15 years, and somehow iMessage has never been a problem. Most of the time I don't even know if someone uses iMessage or not.
Thanks to Apple co-opting phone numbers, there's literally no need to ever have iMessage for anyone
The machine I'm using now represents my choices and matches what matters to me, and it works closer to perfectly than any of my machines in the past.
And yes, I have worked with Macs, and no, the UX and the entire tyranny of the Apple ecosystem was not something I could live with.
And yes, this machine is fast, predictable, a joy to work with, and a tool I control, not a tool that controls me. If something happens to it, I can order the same part that goes into a new machine and keep using my laptop.
Like, for phones, I want a phone which runs Linux, has NFC support, and also has iMessage so my friend who only communicates with blue-bubbles and will never message a green-bubble will still talk to me. I also want it to have regulatory approval in the country I live in so I can legally use it to make calls.
Because apple has closed the iMessage ecosystem such that a linux phone can't use it, such a device is impossible. I cannot vote for it.
As such, I will complain about every phone I own for the foreseeable future.
I hope it'll work on an M4 Mac Mini. Does anyone know what hardware to get? You'll need a full ATX PSU to supply power, right? And then tinygrad can do LLM inference on it?
It takes a standard PSU. However, Mac Minis don't have OCuLink, so you might be a bit limited by whatever USB-C can do.
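As for the tinygrad question: if it works the way the article implies, the user-facing side should be unremarkable. A minimal smoke-test sketch, assuming the eGPU shows up as tinygrad's "AMD" backend (or "NV" for an NVIDIA card) once the driver is loaded; the device names here are assumptions, not something confirmed in the article:

    # Hedged sketch: verify tinygrad can run a kernel on the eGPU.
    # Assumes the card is exposed via tinygrad's "AMD" (or "NV") backend.
    from tinygrad import Tensor, Device

    Device.DEFAULT = "AMD"              # assumption: switch to "NV" for NVIDIA
    a = Tensor.rand(4096, 4096)
    b = Tensor.rand(4096, 4096)
    c = (a @ b).realize()               # force the matmul to run on the device
    print(c.shape, "ran on", Device.DEFAULT)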
Now if Intel can get their Arc drivers in order, we'll see some real budget fun.
https://www.newegg.com/intel-arc-pro-b70-32gb-graphics-card/...
32 GB of VRAM for $1,000. Plus a $500 Mac Mini.
Article mentions: "Apple finally approved our driver for both AMD and NVIDIA"
Does not mention Intel (GPUs). Select AMD GPUs work on macOS, but...
Macs (both Intel and ARM) support Thunderbolt, but eGPUs only work on Intel Macs, and basically only with AMD.
The good news is that for mid-range gaming the choices are solid, and CUDA works on AMD these days.
I own one of these; the cage is just a piece of plastic. Anyway, I don't think $80 is that big of a difference here. I can't really afford a $4K NVIDIA GPU. Intel is my only hope.
It would work just like a discrete GPU when doing CPU+GPU inference: you'd run some of the layers on the discrete GPU and place the rest in unified memory. You'd want to minimize CPU/GPU transfers even more than usual, since a Thunderbolt connection only gives you throughput equivalent to PCIe 4.0 x4.
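For a concrete (hedged) illustration of that layer split, this is roughly how it's expressed with llama-cpp-python, one common way to do CPU+GPU inference; the model path and layer count below are placeholders, not values from this thread, and this isn't necessarily how tinygrad itself would do it:

    # Illustrative only: split transformer layers between a discrete GPU and
    # system/unified memory. Placeholders: model path and layer count.
    from llama_cpp import Llama

    llm = Llama(
        model_path="model.gguf",
        n_gpu_layers=20,      # only these layers live on the discrete GPU
        n_ctx=4096,
    )
    out = llm("Q: Why keep layers resident on the GPU? A:", max_tokens=64)
    print(out["choices"][0]["text"])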
How big a bottleneck is Thunderbolt 5 compared to an SSD? Is the 120 Gbps mode only available when linked to a monitor?
for thunderbolt enclosures, consider going through the list - https://egpu.io/best-egpu-buyers-guide/#tb3-enclosures
zero idea about mac support so YMMV.
Or I could have totally misunderstood the role of Docker in this.
My read of everything is that they are using Docker for NVIDIA GPUs for the sake of "how do you compile code to target the GPU"; for AMD they're just compiling their own LLVM with the appropriate target on macOS.
I would definitely be into this if adding an eGPU were supported as a first-class feature.
I got an eGPU back in 2018 and could never get it to work. To the point that it soured me from doing it again.
These days for heavy-duty work I just offload to the cloud. This all feels like NVIDIA trying to be relevant versus ARM.
Except it's done by a third party, tinygrad, so it's more non-NVIDIA people wanting to use NVIDIA hardware on Apple hardware than "NVIDIA trying to be relevant".
For well over the past decade, Apple has not allowed newer NVIDIA GPUs (by not approving drivers).
A seven-year-old GPU (e.g. Vega 64, GTX 1080 Ti) can still process more tokens/second than most Apple Silicon (particularly the lower-end chips).
As discussed elsewhere, Apple Max/Ultra processors are best suited for huge models (but are not as fast as, e.g., an RTX 5090).
>>Apple approves...
This is a big deal.
I hooked up a Radeon RX 9060 XT to my Fedora KDE laptop (Yoga Pro 7 14ASP9) using a Razer Core X Chroma (40Gbps), and the performance when using the eGPU was very similar to using the Radeon 880M built into the laptop's Ryzen 9 365 APU.
So at least with my setup, performance is not great at all.
On paper, TB4 is capable of pushing 5GB/s, which is somewhere between a PCIe 3.0 x4 and x8 link, while a PCIe 4.0 x16 link can do ~31.5GB/s.
For numbers about all PCIe generations and lane counts, see the "History and revisions" section here: https://en.wikipedia.org/wiki/PCI_Express
Edit to add: the performance I measured is in gaming workloads, not compute
First, you need to connect the display directly to the eGPU rather than to the laptop.
Second, you need to make sure you have enough VRAM to minimize texture streaming during gameplay.
Third, you'll typically see better performance in terms of higher settings/resolutions vs higher framerates at lower settings/resolutions.
Fourth, depending on your system, you may be bottlenecked by other peripherals sharing PCH lanes with the Thunderbolt connection.
Finally, depending on the Thunderbolt version, PCIe bandwidth can be significantly lower than the advertised bandwidth of the Thunderbolt link. For example, while Thunderbolt 3 advertises 40 Gbps, and typically connects via x4 PCIe 3.0 (~32 Gbps), for whatever reason it imposes a 22 Gbps cap on PCIe data over the Thunderbolt link.
Even taking all this into account, you'll still see a significant performance drop on a current-gen GPU when running over Thunderbolt, though I'd still expect a useful performance improvement over integrated graphics in most cases (though not necessarily worth the cost of the eGPU enclosure vs just buying a cheap used minitower PC on eBay and gaming on that instead of a laptop).
Using proprietary connectors.
> XHCI
Not on Lightning.
> AHCI
How exactly would Apple not support AHCI?
> Using proprietary connectors.
Not for the past decade; it's been no connectors for most products, but standard PCIe connectors for the Mac Pro, and NVMe over Thunderbolt works fine.
>> XHCI
> Not on Lightning.
Again, not relevant to any recent products. And I'm pretty sure you're misunderstanding what XHCI is if you think anything with a Lightning connector is relevant here (XHCI is not USB 3.0). You can connect a Thunderbolt dock that includes an XHCI USB host controller and it works out of the box with no further driver or software support. I assume you can do the same with a USB controller card in a Mac Pro.
>> AHCI
> How exactly would Apple not support AHCI?
This might be another case of you not understanding what you're talking about and being lost in an entirely different layer of the protocol stack. Not supporting AHCI would be easy, since they're no longer selling any products that use SATA, and PCIe SSDs that use AHCI instead of NVMe died out a decade ago. But as far as I know, a SATA controller card at the far end of a Thunderbolt link or in a Mac Pro PCIe slot should still work, if the SATA controller uses AHCI instead of something proprietary, as is typical for SAS controllers.
For the same reason that Microsoft requires Windows driver signing?
Drivers run with root permissions.
Isn't that the whole point of the walled garden, that they approve things? How could they aim and realize a walled garden without making things like that have to pass through them?
Because third-party drivers are usually utter dogshit. That's how Apple managed to get double the battery life of comparable Windows-based offerings even in the Intel era.
Modern Macs are Macintosh descendants, while by contrast PCs are IBM PC descendants (technically their real name is "PC clone", but because the IBM PC doesn't exist anymore the "clone" part has been dropped).
And with Apple Silicon Macs the two are again very different: for example, Macs don't use NVMe drives, they use raw NAND (the controller is integrated into the SoC), and they don't use UEFI or a BIOS, but a combination of Boot ROM, LLB, and iBoot.
If Apple was in the high-end server market, I see no reason why the company I was working for would not be running macOS on Apple hardware as servers, instead of the fleet of Linux based servers they had.