undefined | Better HN

0 pointssho_hn1y ago0 comments

While I'm as amused as everyone else - I think it's technically accurate to point out that the "we trained it for $6 mio" narrative is contingent on the done investment by others.

0 comments

Palmik1y ago

When I use NVIDIA GPUs to train a model, I do not consider the R&D cost to develop all of those GPUs as part of my costs.

When I use an API to generate some data, I do not consider the R&D cost to develop the API as part of my costs.

kobalsky1y ago

OpenAI has been in a war-room for days searching for a match in the data, and they just came out with this without providing proof.

My cynical opinion is that the traning corpus has some small amount of data generated by OpenAI, which is probably impossible to avoid at this point, and they are hanging on that thread for dear life.

bbqfog1y ago

OpenAI's models were also trained on billions of dollars of "free" labor that produced the content that it was trained on.

sho_hnOP1y ago

Oh, absolutely. I'm not defending OpenAI, I just care about accurate reporting. Even on HN - even in this thread - you see people who came away with the conclusion that DeepSeek did something while "cutting cost by 27x".

But that's a bit like saying that by painting a a bare wall green you have demonstrated that you can build green walls 27x cheaper, ignoring the cost of building the wall in the first place.

Smarter reporting and discourse would explain how this iterative process actually works and who is building on who and how, not frame it as two competing from-scratch clean room efforts. It'd help clear up expectations of what's coming next.

It's a bit similar to how many are saying DeepSeek have demonstrated independence from nVidia, when part of the clever thing they did was figure out how to make the intentionally gimped H800s work for their training runs by doing low-level optimizations that are more nVidia-specific, etc.

Rarely have I seen a highly technical topic see produce more uninformed snap takes than this week.

Palmik1y ago

You are underselling or not understanding the breakthrough. They trained 600B model on 15T tokens for <$6/m. Regardless of the provenance of the tokens, this in itself is impressive.

Not to mention post-training. Their novel GRPO technique used for preference optimization / alignment is also much more efficient than PPO.

1 more reply

visarga1y ago

> that's a bit like saying that by painting a a bare wall green you have demonstrated that you can build green walls 27x cheaper, ignoring the cost of building the wall in the first place

That's a funny analogy, but in reality DeepSeek did reinforcement learning to generate chain of thought, which was used in the end to finetune LLMs. The RL model was called DeepSeek-R1-Zero, while the SFT model is DeepSeek-R1.

They might have boostrapped the Zero model with some demonstrations.

> DeepSeek-R1-Zero struggles with challenges like poor readability, and language mixing. To make reasoning processes more readable and share them with the open community, we explore DeepSeek-R1, a method that utilizes RL with human-friendly cold-start data.

> Unlike DeepSeek-R1-Zero, to prevent the early unstable cold start phase of RL training from the base model, for DeepSeek-R1 we construct and collect a small amount of long CoT data to fine-tune the model as the initial RL actor. To collect such data, we have explored several approaches: using few-shot prompting with a long CoT as an example, directly prompting models to generate detailed answers with reflection and verification, gathering DeepSeek-R1Zero outputs in a readable format, and refining the results through post-processing by human annotators.

bbqfog1y ago

I don't agree. Walls are physical items so your example is true, but models are data. Anyone can train off of these models, that's the current environment we exist in. Just like OpenAI trained on data that has since been locked up in a lot of cases. In 2025 training models like Deepseek is indeed 27x cheaper, that includes both their innovations and the existence of new "raw material" to do such a thing.

1 more reply

scotty791y ago

The opposite, is claiming that OpenAI could have now built better performing, cheaper to run model (when compared to what they published) training it at 1% cost on output of their previous models. ... But they chose not to do it.

freehorse1y ago

That is the case anyway for training any llm. It is contingent on the work done by all those who produced the data.

j / k navigate · click thread line to collapse

0 comments

Palmik1y ago

When I use NVIDIA GPUs to train a model, I do not consider the R&D cost to develop all of those GPUs as part of my costs.

When I use an API to generate some data, I do not consider the R&D cost to develop the API as part of my costs.

kobalsky1y ago

OpenAI has been in a war-room for days searching for a match in the data, and they just came out with this without providing proof.

bbqfog1y ago

OpenAI's models were also trained on billions of dollars of "free" labor that produced the content that it was trained on.

sho_hnOP1y ago

But that's a bit like saying that by painting a a bare wall green you have demonstrated that you can build green walls 27x cheaper, ignoring the cost of building the wall in the first place.

Rarely have I seen a highly technical topic see produce more uninformed snap takes than this week.

Palmik1y ago

You are underselling or not understanding the breakthrough. They trained 600B model on 15T tokens for <$6/m. Regardless of the provenance of the tokens, this in itself is impressive.

Not to mention post-training. Their novel GRPO technique used for preference optimization / alignment is also much more efficient than PPO.

1 more reply

visarga1y ago

> that's a bit like saying that by painting a a bare wall green you have demonstrated that you can build green walls 27x cheaper, ignoring the cost of building the wall in the first place

They might have boostrapped the Zero model with some demonstrations.

bbqfog1y ago

1 more reply

scotty791y ago

freehorse1y ago

That is the case anyway for training any llm. It is contingent on the work done by all those who produced the data.

j / k navigate · click thread line to collapse