This setup will work for 100B models as well. And yes, the Mac will draw less power, but the Nvidia machine will be many times faster. So depending on your specific Mac and your specific Nvidia setup, the performance per watt will be in the same ballpark. And higher absolute performance is certainly a nice perk.
> You can certainly daisy chain several 3090's together but it doesn't work seamlessly.
Citation needed; there's no "daisy chaining" in the setup I describe, and low level libraries like pytorch as well as higher level tools like Ollama all seamlessly support multiple GPUs.