Better HN


epolanski 1y ago

I really don't see a correlation here to be honest.

Eventually all future AIs will be trained on synthetic input, since the amount of (quality) data we humans can produce is quite limited.

The fact that the output of one AI has been used in the training of another one seems irrelevant.


glenstein 1y ago

The issue isn’t just that AI trained on AI is inevitable; it's whose AI is being used as the base layer. Right now, OpenAI’s models are at the top of that hierarchy. If DeepSeek depended on them, it means OpenAI is still the upstream bottleneck, not easily replaced.

The deeper question is whether DeepSeek has achieved real autonomy or if it’s just a derivative work. If the latter, then OpenAI still holds the keys to future advances. If DeepSeek truly found a way to be independent while achieving similar performance, then OpenAI has a problem.

The details of how they trained matter more than the inevitability of synthetic data down the line.

janalsncm 1y ago

> whether DeepSeek has achieved real autonomy or if it’s just a derivative work

This question is malformed, imo. Every lab is doing derivative work. OpenAI didn’t invent transformers, Google did. Google didn’t invent neural networks or backpropagation.

If you mean whether OAI could have prevented DS from succeeding by cutting off their API access, probably not. Maybe they used OAI for supervised fine-tuning in certain domains, like creative writing, which are difficult to formally verify (although they claim to have used one of their own models). Or perhaps during human preference tuning at the end. But either way, there are many roads to Rome, and OAI wasn’t the only game in town.

epolanski (OP) 1y ago

> then OpenAI still holds the keys to future advances

Point is, those future advances are worthless. Eventually anybody will be able to feed other models' outputs into their own training.

There's no moat here. LLMs are commodities.

glenstein 1y ago

If LLMs were already pure commodities, OpenAI wouldn't be able to charge a premium, and DeepSeek wouldn’t have needed to distill their model from OpenAI in the first place. The fact that they did proves there’s still a moat—just maybe not as wide as OpenAI hoped.
