0 points · echelon · 2d ago · 0 comments

We don't need rinky-dink RTX models that budget VRAM.

We need large scale open weights models just as capable as what's at the frontier.

And we need the ability to rent compute and spin up the weights easily. One-click, easy enough for anyone. Easier than nerd tools like ComfyUI, Claw, and node graph garbage.

Freedom is owning very large scale weights. Anything less is subsistence.

10 comments · 2 top-level

ktallett · 2d ago · 7 in thread

We need to improve water and energy usage, and this method doesn't. Most people are not reinventing the wheel; a shared AI repository, communicated between online local computers, would remove much of the need for these large models.

simonw · 2d ago

I'd love to see credible numbers on the energy usage of thousands of people running models on their own devices compared to sharing data center resources to run big models that serve many different people at the same time.

My hunch is that the energy/water usage of the data centers is a whole lot more efficient than everyone running at home, but I'd be interested in seeing real data on that.

Windchaser · 2d ago

Water usage goes up with data centers because more cooling is needed when you run the hardware harder.

So: if you're running the models on your own machine, presumably you're not running them as often, and air cooling is sufficient. But, at the same time, this is less efficient in terms of hardware use; the data centers need water cooling specifically because they're getting more bang for their buck by running their hardware harder.

So that's the tradeoff: more hardware-use efficiency means more water usage.

verdverm · 2d ago

With hardware like the Spark and Strix, the water usage is known to be zero, yea?

On the energy front, I assume it's less efficient, but I also think there is a tradeoff between efficiency and freedom; that's why I have my own hardware.

cold_harbor · 2d ago

the comparison misses that local LLM usage covers tasks you'd never send to an API — private code, offline work, medical notes. the baseline is 'local vs not-doing-it', not 'local vs cloud'

echelon (OP) · 2d ago

NO!

This is the wrong approach that will turn us into serfs. We need big honking models that do what the leading foundation hyperscaler models do to within a few percentage points of measured performance.

The small-scale models are not productive, and the duct tape solutions built on top of them are hobbyist-tier "year of Linux on desktop" toys.

I imagine fedora-wearing, crypto-shilling, coupon-cutting boffins every time I see some small-weights thing lauded as the future. This is the Pine Phone F-Droid of AI.

"SMS works most of the time on my phone, I swear! I don't really need my banking app!"

That is not big model energy.

Nothing outside of the top ten is worth spending any time on, and we need to focus on models that bridge the gap.

You're talking about impractical toys for highly technical people wasting their own time. That doesn't move the needle or have any economic impact on the competitive landscape.

We need sharp teeth that bite at the legs of the top-tier foundation labs and hold them back from running away with the prize.

We've been through this time and time again over the last thirty years. It's the same shaped problem as before. We don't need toys - we need real infra for real people paying money to do work. Not freeware for freeloaders who don't spend and invest in the problem space.

Large models fit that precisely, because they force investment into a wide variety of open infra, routers, inference engines, etc. Not to mention the weights ecosystem itself.

ktallett · 2d ago

Firstly, unless you are the leader of one of the FAANGs, you are a serf on the whole, if you believe that philosophy is relevant.

We need the right tool for the job. Certain models have a minimum energy cost no matter what the task is, and that's often wasted, both on small-scale tasks and on repetition.

There is a place and a need for large models, local models, and single-purpose models. The same way there is a need for both HPC and single-board computers.

robwwilliams · 2d ago

I initially thought this was a great tongue-in-cheek comment. I still think it is a joke.

xtracto · 1d ago · 1 in thread

we need this: https://news.ycombinator.com/item?id=48516751

> distributed LLM inference. We are at a point where no single person can set up a rig to run a SOTA model, it is just too expensive. So we must build and adopt frameworks that allow individuals to share resources to run SOTA models in a distributed manner. That way they will also be non-censorable by governments.

Also, the only way to prevent one entity from weaponizing it is by giving EVERYONE access to it.

echelon (OP) · 1d ago

Just rent an H200.

You rent your fiber optic internet. You're doing just fine. The world isn't collapsing because you don't own the hardware racks, routers, and fiber lines.

This crazy zany P2P communal infra is Arch Linux coded - too much work, aimed at the 0.001% of users, and the juice isn't worth the squeeze.

It sounds exactly the same as being mad at your ISP, so you want to build a mesh internet protocol over microwave dishes and share with everyone in your neighborhood. That was a thing in the '00s. And it predictably got nowhere and delivered no value.

Nobody's got time for this kind of stuff except for hobbyists. It's not solving a real problem. Nor does it attract business investment to grow into a healthy offering.

Just build OpenRunPod instead.

The problem is that the world doesn't have access to frontier open weights. That's the only problem to focus on and solve.

The quickest way to get there is to use large scale open weights instead of little rinky dink RTX distillations. And the easiest way to run them is by renting H100s or using a managed service offering. There's nothing wrong with either of those options.

The infrastructure is the easy part anyway.
