
lolinder · 1y ago

So people have said, but as I've noted I disagree with the characterization that training is equivalent to compilation. Even the companies that can afford to train a foundation model do so once and then fine-tune it from there to modify it. They only start a new training run when they're building a brand new model with totally different characteristics (such as a different parameter count).

Training is too expensive for the training data to be the preferred form for making modifications to the work. Given that, the weights themselves are the closest thing these things have to "source code".

And this is where the reproducibility argument falls apart: on what basis can we insist that the preferred form for modifying an LLM (the weights) must be reproducible to be open source but the preferred form for modifying a piece of regular software (the code) can be open sourced as is, with none of the processes used to produce the code?


fragmede · 1y ago

Just because I can hex edit Photoshop.exe to say what I want in the about dialog doesn't make it open source, even if it is faster and easier to hexedit it than it is to recompile from source.

For the weights to embed all of the training data in the model, some data must, by definition, be lost. That data can't be recovered, no matter how much you fine-tune the model. Because we can't recover it, we don't know how alignment gets set, or the extent of it.

The closest thing these things have to source code is the source code and training data used to create the model, because that's what's used to create the model. How big a system is necessary to train it doesn't factor in. It used to take many days to compile the Linux kernel, and many people at the time didn't have access to systems that could even compile it.

lolinder (OP) · 1y ago

> Just because I can hex edit Photoshop.exe to say what I want in the about dialog doesn't make it open source, even if it is faster and easier to hexedit it than it is to recompile from source.

First, licenses matter. Photoshop.exe is closed source first and foremost because the license says so.

Second, and more importantly for this discussion, Adobe doesn't prefer to work with a hex editor; they prefer to work with the source code.

OpenAI prefers to fine tune their existing models rather than train new ones. They fine tune regularly, and have only trained from scratch four times total, with each of those being a completely new model, not a modification.

That means the weights of an LLM are the preferred form for modification, which meets the GPL's definition of 'source code':

> The “source code” for a work means the preferred form of the work for making modifications to it.

fsflover · 1y ago

> OpenAI prefers to fine tune their existing models rather than train new ones.

This "preference" is solely based on the cost, not convenience, unlike in the GPL definition. The cost is going to change fast and should not be a part of the definition.
