Better HN

0 points · segmondy · 1mo ago · 0 comments

Joke's on you. We are already running Deepseekv4Flash, Mimo2.5, MiniMax2.7, and Qwen3-397B locally on very affordable hardware. These models are in the realm of Opus 4.6. For those of us a bit crazy, we are running KimiK2.6, GLM5.1 and more ...


7 comments · 2 top-level

binyu · 1mo ago · 3 in thread

They all definitely still fall short of Opus 4.6, though. They are good but fail on extremely complex tasks, in contrast with a frontier model that will keep trying until it succeeds or exhausts the solution space.

julianlam · 1mo ago

Not by much, and moving the goalposts makes for a bad comparison. Local open-weight models are already more powerful than frontier models from only a year back.

If you believe what you read here, the gap is closing fast.

segmondy · OP · 1mo ago

Frontier models don't keep trying until they succeed. That's a harness problem, and best believe it, the best harnesses are private, not public.

binyu · 1mo ago

It is much more of a context window size and model capability problem. Local models are not even remotely close to solving complex problems, even when used with the same harness.

root_axis · 1mo ago · 2 in thread

I have two A100s and have been playing with local models for years. There are definitely moments where they are quite impressive, but the small context sizes and unreliability become obvious immediately.

> For those of us a bit crazy, we are running KimiK2.6, GLM5.1

Yes, those can compare to Opus, but you can't run those unquantized for less than $400k in hardware.

doctorpangloss · 1mo ago

Two 512GB Mac Studio M3 Ultras and one USB cable can run all those models, for maybe about $30,000 in hardware. Based on my benchmarks, those Mac Studios were twice as fast as the A100s on Deepseek v4 Flash, which is quantized, but not in a really lossy way.

root_axis · 1mo ago

That cannot run KimiK2.6 or GLM5.1, i.e., models within the ballpark of anything offered by frontier companies.
