
0 points · richwater · 11mo ago

If you're okay with lower quality output, a $10k Mac Studio will get you there. But you _will_ have to accept lower quality outputs compared to today's frontier models.


6 comments · 3 top-level

flashgordon · 11mo ago · 3 in thread

Yeah, I was actually thinking about a proper rig. My gut feel is that a rig wouldn't be as expensive as a Mac and would actually have a higher ROI (at the expense of portability)?

My other worry about the Mac is how un-upgradable it is. Again, not sure how fruitful this is, but in my (probably fantasy-land) view, if I can set up a rig and then keep updating components as needed, it might last me a good 5 years, say for $20k over that period? Or is that too hopeful?

So for $20k over 5 years, or $4k per year, it comes to about $330 a month. Roughly the equivalent of 2 MAX pro subscriptions. Let's be honest: right now, with these limits, running more than 1 in parallel is going to be forbidden.

If I can run 2 Claude-level models (assuming the DS and Qwens are there), then I am already breaking even, but without having to participate in training with all my codebases (and I assume I can actually unlock something new in the process of being free).
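The amortization in this comment is easy to sanity-check. Below is a minimal sketch, assuming the $20k budget is spread evenly over 5 years and ignoring electricity and resale value. The subscription price of $200 a month is an assumption for illustration only, not a figure from the thread.

```python
# Back-of-the-envelope amortization of a local rig budget.
# ASSUMED_SUBSCRIPTION_PRICE is a hypothetical figure; adjust it to your plan.

def monthly_cost(total_spend: float, years: float) -> float:
    """Spread a total hardware budget evenly over the given number of years."""
    return total_spend / (years * 12)


def subscriptions_equivalent(monthly: float, subscription_price: float) -> float:
    """How many subscriptions the same monthly spend would buy."""
    return monthly / subscription_price


ASSUMED_SUBSCRIPTION_PRICE = 200.0  # assumption, USD per month

rig_monthly = monthly_cost(20_000, 5)
equivalent = subscriptions_equivalent(rig_monthly, ASSUMED_SUBSCRIPTION_PRICE)

print(f"Rig budget per month: ${rig_monthly:.0f}")
print(f"Equivalent subscriptions: {equivalent:.2f}")
```

Under these assumptions the rig budget works out to a little under two subscriptions a month, before power costs are added.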

lossolo · 11mo ago

Buy 4–8 used 3090s (providing 96–192 GB of VRAM), depending on the model and weight quantization you want to run. A used 3090 costs around $800. Add more RAM to offload layers if needed. This setup currently offers the best value for performance.

https://www.reddit.com/r/LocalLLaMA/comments/1iqpzpk/8x_rtx_...

You can look for more rig examples on that subreddit.
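To see how the 4–8 card range maps onto model sizes, here is a rough sketch. It uses the 24 GB per card implied by the 96–192 GB figures and the $800 price quoted above. The weight-size estimate (parameters × bits ÷ 8) covers weights only; the 20% headroom reserved for KV cache and activations is an assumption, and the 70B model size is a hypothetical example.

```python
# Rough sizing for a multi-3090 rig: total VRAM, GPU cost, and whether
# a model's weights fit at a given quantization.

GPU_VRAM_GB = 24      # per card; 4 cards -> 96 GB, as in the comment
GPU_PRICE_USD = 800   # used price quoted in the comment


def rig(num_gpus: int) -> tuple[int, int]:
    """Return (total VRAM in GB, total GPU cost in USD)."""
    return num_gpus * GPU_VRAM_GB, num_gpus * GPU_PRICE_USD


def weight_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_weight / 8


def fits(params_billions: float, bits_per_weight: float, num_gpus: int,
         usable_fraction: float = 0.8) -> bool:
    """True if the weights fit in the usable share of VRAM.

    usable_fraction is an assumed allowance for KV cache and activations.
    """
    vram_gb, _ = rig(num_gpus)
    return weight_gb(params_billions, bits_per_weight) <= vram_gb * usable_fraction


for gpus in (4, 8):
    vram, cost = rig(gpus)
    print(f"{gpus} GPUs: {vram} GB VRAM, ${cost} in cards")

# Hypothetical 70B-parameter model at two precisions.
print("70B @ 4-bit on 4 GPUs:", fits(70, 4, 4))
print("70B @ 16-bit on 4 GPUs:", fits(70, 16, 4))
print("70B @ 16-bit on 8 GPUs:", fits(70, 16, 8))
```

Anything that does not fit would need heavier quantization or offloading layers to system RAM, as the comment suggests.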

esskay · 11mo ago

I do wonder what the ongoing cost there would be. The ~$9k hardware cost is an easy thing to quantify, but going with a bank of very hot, power-hungry GPUs is going to rack up a hefty monthly bill in many parts of the world.

I imagine there are also going to be some problems hooking something like that up to a normal wall socket in North America? (Like the Reddit poster, I am in Europe, so on 220 V.)
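Both questions can be roughed out numerically. The sketch below is an estimate under stated assumptions, not a measurement: 350 W per card under load (the 3090's rated board power, ignoring CPU, fans, and PSU losses), 8 hours of full load per day, an electricity price of 0.30 per kWh, 15 A and 16 A breaker ratings, and an 80% continuous-load derating. Swap in your own numbers.

```python
# Estimate the power draw, monthly electricity cost, and wall-circuit
# headroom for a multi-GPU rig. All constants below are assumptions.

GPU_WATTS = 350  # assumed per-card draw under load; rest of system ignored


def rig_watts(num_gpus: int, other_watts: float = 0.0) -> float:
    """Total draw in watts for the GPUs plus any other components."""
    return num_gpus * GPU_WATTS + other_watts


def monthly_energy_cost(watts: float, hours_per_day: float,
                        price_per_kwh: float, days: int = 30) -> float:
    """Electricity cost for one month at a constant draw."""
    return watts / 1000 * hours_per_day * days * price_per_kwh


def circuit_capacity_watts(volts: float, amps: float,
                           continuous_factor: float = 0.8) -> float:
    """Sustained watts a circuit can supply, with an assumed derating."""
    return volts * amps * continuous_factor


draw = rig_watts(8)
bill = monthly_energy_cost(draw, hours_per_day=8, price_per_kwh=0.30)
print(f"8 GPUs draw about {draw:.0f} W; monthly bill about {bill:.0f}")

# Assumed circuits: 120 V / 15 A (North America) and 220 V / 16 A (Europe).
for label, volts, amps in (("120 V / 15 A", 120, 15), ("220 V / 16 A", 220, 16)):
    capacity = circuit_capacity_watts(volts, amps)
    verdict = "fits" if draw <= capacity else "exceeds"
    print(f"{label}: {capacity:.0f} W sustained, rig {verdict} one circuit")
```

Under these assumptions an 8-card rig at full load exceeds a single 120 V / 15 A circuit, so it would need to be split across circuits or power-limited, while it just about fits on the assumed 220 V circuit.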


flashgordon · 11mo ago

Also, I wonder if, like in the old days, you could "try" these out somewhere first. Imagine plonking down 5-10k and nothing works (which is fine if you can get a refund, ha).

OtherShrezzing · 11mo ago

>But you _will_ have to accept lower quality outputs compared to today's frontier models.

I'm curious how much lower quality we're talking about here. Most of the work I ever get an LLM to do is glue code or trivial features. I'd expect some fine-tuned Codestral-type model with well-focused tasks could achieve good performance locally. I don't really need world-leading-expert quality models to code up a hamburger menu in a React app and set the background-color to #A1D1C1.

gnator · 11mo ago

Has anyone tried running with a Tenstorrent card? I wanted to see how they fare.
