A good definition of “truly open” is whether someone with no extra information can reproduce the exact same results from only what has been made available. If that is not possible because the reproduction methodology is closed (a common reason, as in this case), then what has been made available is not truly open.
We can sit here and argue technically about whether or not the subject matter violates some arbitrary “open source” definition, but that still doesn’t change the fact that it’s not truly open in spirit.
A parallel can be drawn with model weights being static assets delivered in their completed state.
(I favor the full process being released, especially for scientific reproducibility, but that is another point.)
What if someone gave you a binary and the source code, but not a compiler? Maybe not even a language spec?
Or what if they gave you a binary, the source code, and a fully documented language spec, everything all the way down to the compiler, BUT it only runs on special proprietary silicon? Or maybe even the silicon is fully documented, but producing that silicon is effectively out of reach for all but F100 companies?
It's turtles all the way down...
We already have a definition of open source. I don't see any reason to change it.
It's basically like giving people a binary program and calling it open source because the compiler and runtime used are open source.
Make no mistake, I am super grateful to OSI for their efforts, and most of my code out there uses one of their licenses. I just think they are limited by the circumstances. Some things I consider open do not conform to their licenses, and, as here, some things that conform might not be really open.
All sorts of intangibles end up in open source projects. This isn’t a science experiment that needs replication. They’re not trying to prove how they came up with the image/code/model.
Look into the Affero GPL. Images are inert static assets. Here we are talking about the back-end engine. The fact that neural networks and model weights are a non-von Neumann architecture doesn’t negate the fact that they are executable code and not just static assets!
By this logic, any freely downloadable executable software (a.k.a. freeware) would also be open source, even though the details of how to build it aren't disclosed.
If I hand you a beer for free that’s freeware. If I hand you the recipe and instructions to brew the beer that is open source.
Lately we muddy the waters too much and call “free to use” things “open source”.
Yeah, but what those "open source" models are is like you handing me a bottle of beer, plus the instructions to make the glass bottle. You're open-sourcing something, just not the part that matters. It's not "open source beer", it's "beer in an open-source bottle". In the same fashion, those models aren't open source - they're closed models inside a tiny open-source inference script.
The model weights in, e.g., TensorFlow are the source code.
It is not a von Neumann architecture, but a gigabyte of model weights is the executable part, no less than a gigabyte of imperative code.
Now, training the model is akin to the process of writing the code. In classical imperative languages, that code may be such spaghetti code that each part is intertwined with 40 others, so you can’t just modify something easily.
So not being able to modify the code touches Freedom 1 or whatever. But at least you have Freedom 0: hosting the model where you want, without being charged an exorbitant amount for it, getting cut off, or having the model change out from under you via RLHF for political correctness or whatever.
OpenAI has not even met Freedom 0 of the FSF’s or OSI’s definition. But others can.
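To make the “weights are the executable part” point concrete, here's a toy sketch (NumPy, with made-up weight values): the forward-pass code is a fixed interpreter, and the behavior lives entirely in whichever weights you load, much like a CPU behaves differently depending on which binary you run.

```python
import numpy as np

def forward(x, weights):
    # The "interpreter": a fixed two-layer network. The code never
    # changes; the program's behavior is encoded in the weights.
    h = np.maximum(0, x @ weights["W1"])  # ReLU hidden layer
    return h @ weights["W2"]

x = np.array([1.0, 2.0])

# Two different "programs" running on the same architecture
# (hypothetical values, purely for illustration).
weights_a = {"W1": np.eye(2), "W2": np.array([1.0, 1.0])}
weights_b = {"W1": -np.eye(2), "W2": np.array([1.0, 1.0])}

print(forward(x, weights_a))  # → 3.0
print(forward(x, weights_b))  # → 0.0
```

Same input, same code, different weights, different behavior: exactly the role a binary plays on a von Neumann machine.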
The model weights aren't source code. They are the binary result of compiling that source code.
The source code is the combination of the training data and the configuration of the model architecture that runs against it.
The model architecture could be considered the compiler.
If you give me gcc and your C code I can compile the binary myself.
If you give me your training data and code that implements your model architecture, I can run those to compile the model weights myself.
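The compilation analogy above can be sketched in a few lines (a toy linear model in NumPy, standing in for a real training pipeline): the training data plus the architecture code are the "source", and running training "compiles" them into the weights.

```python
import numpy as np

# "Source code": the training data...
X = np.array([[0.0], [1.0], [2.0], [3.0]])  # inputs
y = np.array([1.0, 3.0, 5.0, 7.0])          # targets (y = 2x + 1)

# ...plus the architecture definition.
def model(x, w, b):
    # Architecture: a single linear layer.
    return x @ w + b

# "Compilation": gradient descent turns data + architecture
# into weights, deterministically given the same inputs.
w, b = np.zeros(1), 0.0
for _ in range(2000):
    err = model(X, w, b) - y
    w -= 0.01 * (X.T @ err) / len(y)
    b -= 0.01 * err.mean()

# The resulting weights are the "binary": w ≈ 2.0, b ≈ 1.0.
print(w, b)
```

Given the same data and the same architecture code, anyone can re-run this and recover the same weights, which is exactly the gcc-plus-C-code situation described above.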