0 points · KPGv2 · 1y ago · 0 comments

At the risk of being called rms, no, that's not what open source means. Open source just means you have access to the source code. Which you do. Code that is open source but restrictively licensed is still open source.

That's why terms like "libre" were born to describe certain kinds of software. And that's what you're describing.

This is a debate that started, like, twenty years ago or something when we started getting big code projects that were open source but encumbered by patents so that they couldn't be redistributed, but could still be read and modified for internal use.

jefftk · 1y ago

> Open source just means you have access to the source code.

That's https://en.wikipedia.org/wiki/Source-available_software , not 'open source'. The latter was specifically coined [1] as a way to talk about "free software" (with its freedom connotations) without the price connotations:

The argument was as follows: those new to the term "free software" assume it is referring to the price. Oldtimers must then launch into an explanation, usually given as follows: "We mean free as in freedom, not free as in beer." At this point, a discussion on software has turned into one about the price of an alcoholic beverage. The problem was not that explaining the meaning is impossible—the problem was that the name for an important idea should not be so confusing to newcomers. A clearer term was needed. No political issues were raised regarding the free software term; the issue was its lack of clarity to those new to the concept.

[1] https://opensource.com/article/18/2/coining-term-open-source...

HDThoreaun · 1y ago

You don't get to redefine what "open" means.

jefftk · 1y ago

It's common for terms to have a more specific meaning when combined with other terms. "Open source" has had a specific meaning now for decades, which goes beyond "you can see the source" to, among other things, "you're allowed to use it without restriction".

RobotToaster · 1y ago

So Swedish meatballs are any ball of meat made in Sweden?

And French fries are anything that was fried in France?

sumeno · 1y ago

Tell that to Sam Altman

esafak · 1y ago

He did not succeed, did he?

dTal · 1y ago

I don't know why you've been downvoted. This is a 100% correct history. "Open source" was specifically coined as a synonym for "free software", and has always been used that way.

sho_hn · 1y ago

> Open source just means you have access to the source code. Which you do.

No, they fail even that test. Neither Meta nor DeepSeek has released the source code of their training pipeline or anything like that. There's very little literal "source code" in any of these releases at all.

What you can get from them is the model weights, which, for the purpose of this discussion, are very similar to compiled binary executable output that you cannot easily reverse, which is the problem open source seeks to address. In the case of Meta, the weights also come with additional limitations on how you may use them.

As a sibling comment said, this is basically "freeware" (with asterisks) but has nothing to do with open source, either according to RMS or OSI.

> This is a debate that started, like, twenty years ago

For the record, I do appreciate the distinction. This isn't meant as an argument from authority at all, but I've been an active open source (and free software) developer for close to those 20 years, am on the board of one of the larger FOSS orgs, and most households have a few copies of FOSS code I've written running. It's also why I care! :-)

nuancebydefault · 1y ago

The weights, which are part of the source, are open. Now you are arguing that it is not open source because they don't provide the source for that part of the source. If you follow that reasoning, you can claim the absence of sources ad infinitum, since every source originates from something.

Kerbonut · 1y ago

The source is the training data and the code used to turn the training data _into_ the weights. Thus GP is correct, the weights are more akin to a binary from a traditional compiler.

nuancebydefault · 1y ago

To me this 'source' requirement does not make sense. It is not as if you bring the training data and the application together and press a train button; there are many more actions involved.

Also, the amount of training data is massive.

Additionally, what about human-in-the-loop training? Do you deliver humans as part of the source?

JumpCrisscross · 1y ago

> they also fail even that test. Neither Meta nor DeepSeek have released the source code of the

This debate is over and makes the open source community look silly. Open model and weights is, practically speaking, open source for LLMs.

I have tremendous respect for FOSS and those who build and maintain it. But arguing for open training data means only toy models can practically exist. As a result, the practical definition will prevail. And if the only people putting forward a practical definition are Meta et al, this is what you get: source available.

sho_hn · 1y ago

I'm not arguing for open training data BTW, and the problem is exactly this sort of myopic focus on the concerns of the AI community and the benefits of open-washing marketing.

Completely breaking the meaning of the term "open source" is causing collateral damage outside the AI topic, and that's where it really hurts. The open source principle is still useful and necessary, and we need words to communicate about it, raise correct expectations, and apply correct standards. As a dev you very likely don't want to live in a tech environment where we regress on this.

It's not "source available" either. There's no source. It's freeware.

"I can download it and run it" isn't open source.

I'm actually not too worried that people won't eventually re-discover the same needs that open source originally discovered, but it's pretty lame if we lose a whole bunch of time and effort to re-learn some lessons yet again.

JumpCrisscross · 1y ago

> it's pretty lame if we lose a whole bunch of time and effort to re-learn some lessons yet again

We need to relearn because we need a different definition for LLMs. One that works in practice, not just at the peripheries.

Maybe we can have FOSS LLMs vs open-source ones, like we do with software licenses. The former refers to the hardcore definition. The latter the practical (and widely used) one.
