(with the caveat that all we have right now are accusations that DeepSeek made use of OpenAI data - it might just as well turn out that DeepSeek really did work independently, and you really could have gotten o1-like performance with much less compute)
> In this study, we demonstrate that reasoning capabilities can be significantly improved through large-scale reinforcement learning (RL), even without using supervised fine-tuning (SFT) as a cold start. Furthermore, performance can be further enhanced with the inclusion of a small amount of cold-start data.
Is this cold-start data what OpenAI is claiming came from their output? If so, what's the big deal?
> To collect such data, we have explored several approaches: using few-shot prompting with a long CoT as an example, directly prompting models to generate detailed answers with reflection and verification, gathering DeepSeek-R1-Zero outputs in a readable format, and refining the results through post-processing by human annotators.
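For the first of those approaches, here's a toy sketch of what "few-shot prompting with a long CoT as an example" could look like. The prompt text and helper function are illustrative, not DeepSeek's actual pipeline:

```python
# Hypothetical sketch: assemble a few-shot prompt whose example
# demonstrates a long chain of thought, to elicit similar traces
# from a base model when collecting cold-start data.
COT_EXAMPLE = """Question: What is 17 * 24?
Reasoning: 17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408.
Answer: 408"""

def build_cold_start_prompt(question: str) -> str:
    return (
        "Solve the problem. Show your reasoning step by step, "
        "then state the final answer.\n\n"
        f"{COT_EXAMPLE}\n\n"
        f"Question: {question}\nReasoning:"
    )

print(build_cold_start_prompt("A train travels 120 km in 1.5 hours. Average speed?"))
```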
Maybe they needed OpenAI's output for their process. But now that their model is open source, anyone can use it as their cold start and spend just as little.
"From scratch" is a moving target. No one who makes their model with massive data from the net is really doing anything from scratch.
They are using the current SOTA tools and models to build new models for cheaper.
It is no better for OpenAI in this scenario either: any competitor can easily copy the results of their expensive training without spending the same, i.e. there is a second-mover advantage and no economic incentive to be the first mover.
To put it another way, the $500 billion Stargate investment will be worth just $5 billion once the models become available for consumption, because that is all it will take to replicate the same outcomes with the new techniques, even if the cold start needed o1 output for RL.
My understanding is that this effectively builds on OpenAI's very expensive initial work and provides a "nearly as good" model that is orders of magnitude cheaper to train and run, one that also provides a basis to continue building and improving without OpenAI, and without human bottlenecks.
That cuts OAI off at the knees in terms of market viability after billions have been spent. If DS can iterate and match the capabilities of the current in-development OAI models in the next year, it may come down to regulatory capture and government intervention to ensure OpenAI's viability as a company.
Human reasoning, as it exists today, is the result of tens of thousands of years of intuition slowly distilled down to efficient abstract concepts like "numbers", "zero", "angles", "cause", "effect", "energy", "true", "false", ...
I don't know what reasoning from scratch would look like without training on examples from other reasoning beings, the way human children do.
First you must invent the universe.
To quote DeepSeek directly:
> DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorporates cold-start data before RL.
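For context, the paper describes the rewards for that RL step as rule-based: an accuracy reward for verifiable answers plus a format reward for the think/answer structure. Here's a minimal sketch of what such a reward function might look like; the tag names match the paper's description, but the weights are illustrative:

```python
import re

def rule_based_reward(completion: str, reference_answer: str) -> float:
    """Sketch of R1-Zero-style rule-based rewards: format + accuracy.
    Weights here are illustrative, not from the paper."""
    reward = 0.0
    # Format reward: reasoning and answer wrapped in the expected tags.
    if re.search(r"<think>.*?</think>\s*<answer>.*?</answer>",
                 completion, re.DOTALL):
        reward += 0.5
    # Accuracy reward: extracted answer matches the reference exactly.
    match = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
    if match and match.group(1).strip() == reference_answer.strip():
        reward += 1.0
    return reward
```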
This manifold is constructed via learning a decontextualized pattern space on a given set of inputs. Given the inherent probabilistic nature of sampling, true reasoning is expressed in terms of probabilities, not axioms. It may be possible to discover axioms by locating fixed points or attractors on the manifold, but ultimately you're looking at a probabilistic manifold constructed from your input set.
But I don't think you can untie this "reasoning" from your input data. It's possible you will find "meta-reasoning", or similar structures found in any sufficiently advanced reasoning manifold, but these highly decontextualized structures might be entirely useless without proper recontextualization, necessitating that a reasoning manifold is trained on input whose patterns follow learnable underlying rules, if the manifold is to be useful for processing input of that kind.
Decontextualization is learning, decomposing aspects of an input into context-agnostic relationships. But recontextualization is the other half of that, knowing how to take highly abstract, sometimes inexpressible, context-agnostic relationships and transform them into useful analysis in novel domains.
This doesn't mean a well-trained model can't reason about input it hasn't encountered before, just that the input needs to be in some way causally connected to the same laws which governed the input the manifold was trained on.
I'm sure we could create a fully generalized reasoning manifold which could handle anything, but I don't see how we possibly get that without first considering and encountering all possible inputs. But these inputs still have to have some form of constraint governed by laws that must be learned through sampling, otherwise you'd just be training on effectively random data.
The other commenter who suggested simply generating all possible sentences and training on internal consistency should probably consider Gödel's incompleteness theorems, and that internal consistency isn't enough to accurately model and interpret the universe. One could construct a thought experiment about an isolated brain in a jar with effectively unlimited neuronal connections, but no sensory connection to the outside world. It's possible, with enough connections, that the likelihood of the brain conceiving of true events it hasn't actually encountered does increase meaningfully. But the brain still has nothing to validate against, and can't simply assume that because something is internally logically consistent, that it must exist or have existed.
Let's just assume that the cost of training can be externalized to other people for free.
If other players can access that data with relatively less effort, then it's futile trying to train your models and improve upon them, as clearly you don't have an architectural moat, just a training moat.
Kind of like an office scene where an introverted hard worker does all the tedious work, while his extroverted colleague promotes it as his own and gains the credit.
The big question really is: are we doing it wrong? Could we have created o1 for a fraction of the price? Will o4 cost less to train than o1 did?
The second question, naturally, is: if we create a smarter LLM, can we use it to create another LLM that is even smarter?
It would have been fantastic if DeepSeek could have come out with an o3 competitor before o3 even became publicly available. That way we would have known for sure that we’re doing it wrong, because then either we could have used o1 to train a better AI, or we could have just trained in a smarter and cheaper way.
Whether or not you could have, you can now.
The model already embodies the "total sum of a massive amount of compute" used to create it; if it's possible to reuse that embodied compute to create a better model, that's good for the world. Forcing everyone to redo all that compute for themselves is, conversely, bad for the world.
We don't make people figure out how to domesticate a cow every time they want a hamburger. Or test hundreds of thousands of filaments before they can have a lightbulb. Inventions, once invented, exist as giants to stand upon. The inventor can either choose to disclose the invention and earn a patent for exclusive rights, or they can try to keep it a secret and hope nobody reverse engineers it.
All of this should have been clear anyway from the start, but that's the Internet for you.
Hmm, I think the narrative of the rise of LLMs is that once the output of humans has been distilled by the model, the human isn't necessary.
As far as I know, DeepSeek adds only a little to the transformer architecture, while o1/o3 added a special "reasoning component" - if DeepSeek is as good as o1/o3, even taking data from it, then it seems the reasoning component isn't needed.
Distillation is a term of art in AI and it is fundamentally incorrect to talk about distilling human-created data. Only an AI model can be distilled.
https://en.m.wikipedia.org/wiki/Knowledge_distillation#Metho...
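Concretely, the term of art means something like the following: a student model is trained to match a teacher model's softened output distribution, not raw text. A minimal sketch of the standard distillation loss (Hinton et al., 2015), assuming PyTorch:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    # The teacher's temperature-softened probabilities are the soft targets.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    # KL(teacher || student), scaled by T^2 so gradient magnitudes stay
    # comparable across temperatures.
    return F.kl_div(log_student, soft_targets,
                    reduction="batchmean") * temperature ** 2
```

Note this requires access to the teacher's logits, which is exactly why "distilling" human-created text is a category error in the strict sense.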
It seems clear that the term can be used informally to denote the boiling down of human knowledge; indeed, it was used that way before AI appeared in the popular imagination.
- v2/v3 (not r1) seem to be cloned from o1/4o output, and perform worse (this cost the oft-repeated 5ish mm USD)
- r1 is specifically a reasoning step (using RL) _on top of_ v2/v3 and performs similarly to o1 (the cost of this is _not reported anywhere_)
- In the o1 blog post, they specifically say they use RL to add reasoning to LLMs: https://openai.com/index/learning-to-reason-with-llms/
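On that RL step: the DeepSeek papers describe it as GRPO, which scores each sampled completion relative to the group sampled for the same prompt instead of training a separate value network. A minimal sketch of the group-relative advantage computation (the rewards here are made up):

```python
import numpy as np

def group_relative_advantages(rewards: np.ndarray) -> np.ndarray:
    """GRPO-style advantages: normalize each completion's reward against
    the group sampled for the same prompt, so no critic model is needed."""
    return (rewards - rewards.mean()) / (rewards.std() + 1e-8)

# e.g. 8 completions sampled for one prompt, scored by a rule-based reward:
rewards = np.array([1.5, 0.5, 0.0, 1.5, 0.5, 0.0, 0.0, 1.5])
print(group_relative_advantages(rewards))
```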
I did not think this, nor did I think this was what others assumed. The narrative, I thought, was that there is little point in paying OpenAI for LLM usage when a much cheaper, similar / better version can be made and used for a fraction of the cost (whether it's on the back of existing LLM research doesn't factor in)
If the narrative is actually that DeepSeek can only reach whatever heights OpenAI has already gotten to with some new tricks, then markets will probably refocus on OpenAI's innovations and price things accordingly, even if the initial cost is huge. It also means OpenAI probably needs a better moat to protect its interests.
I'm not sure where the reality is exactly, but market reactions so far have basically followed that initial narrative and now the rebuttal.
The latter could be a one-time thing, and/or OpenAI could still use their financial might to leverage those innovations and get even better with them.
However, the former destroys their business model and no amount of intelligence and innovation from OpenAI protects them from being copied at a fraction of the cost.
How do you know this?
> If the narrative is actually that DeepSeek can only reach whatever heights OpenAI has already gotten to with some new tricks, then markets will probably refocus on OpenAI's innovations and price things accordingly
Why? If every innovation OpenAI is trying to keep as secret sauce becomes commoditized quickly and cheaply, then why would markets care about any innovations they have? They will be unable to monetize them.
That's what I thought and assumed. This is the narrative that's been running through all the major news outlets.
It didn't even occur to me that DeepSeek could have been training their models using the output of other models until reading this article.
But HOW they are necessary is what has changed. They went from building blocks to stepping stones. From a business standpoint, that's very damaging to OAI and the other players.
And is this related to the lottery ticket hypothesis?
I have a question (disclaimer: reinforcement learning noob here):
Is there a risk of broken telephone with this?
Kinda like repeatedly compressing an already compressed image eventually leads to a fuzzy blur.
If that is the case then I’m curious how this is monitored and/or mitigated.
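The concern is real and goes by "model collapse" in the literature: each generation trained only on the previous generation's output loses fidelity, especially in the tails. A toy illustration (not how any lab actually trains, just the compression analogy in code):

```python
import numpy as np

rng = np.random.default_rng(0)

# "Real" data: samples from a standard normal distribution.
data = rng.normal(loc=0.0, scale=1.0, size=200)

for generation in range(20):
    # Fit a toy "model" (just a mean and std) to the current data...
    mu, sigma = data.mean(), data.std()
    # ...then train the next generation ONLY on that model's samples,
    # never on the original data.
    data = rng.normal(loc=mu, scale=sigma, size=200)
    print(f"gen {generation:2d}: mu={mu:+.3f} sigma={sigma:.3f}")

# mu and sigma drift as a random walk and the tails thin out, because
# each generation sees only the previous generation's approximation.
```

Common mitigations are anchoring each round on real or verified data (e.g. rule-checked answers) rather than chaining purely on model output.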
That is where artificial intelligence is going: copying things from other things. Will there be an AI Eureka moment where it deviates and knows where and why it is wrong?
It seems like if they did in fact distill, then what we have found is that you can create a worse copy of the model for ~$5M in compute by training on its outputs.
Everyone is standing on the shoulders of giants.
Better benchmark scores can be cooked
But if you leave someone in the tech industry of SV/SF long enough, they'll start to get high on their own supply and think they're entitled to insane amounts of value, so...