https://spdx.dev/learn/handling-license-info/
For anyone wondering if this (license information in source files) is necessary, I think the answer is "maybe". Some licenses (e.g. Apache 2) seem to be written such that the license itself requires the disclaimer, and even having copyright information (e.g. contributors who make a substantial contribution adding the name of whoever is assigned the copyright for that contribution to the header) is a good idea. I used to be against this for aesthetic reasons, viewing it somewhat similarly to those annoying corporate email footers, but over time it's become more obvious to me that it is not only great for keeping the license very explicit everywhere but may also be a good idea legally. (IANAL.)
Nobody does that anymore. We have git, and before that we had SVN and CVS and so on.
The legal commentaries I know simply say "in principle that requirement is legally valid and you must do it this way, but in practice no programmer seems to do that, so shrug".
Are you referencing the right license/clause? I don't see how GPLv2's clause 2.b would require this.
> Nobody does that anymore. We have git, and before that we had SVN and CVS and so on.
Yep, I will agree with you that pretty much nobody actually does this, and it does not seem like it is an obstacle so far, e.g. I have not seen legal contention over this, mostly just discussion.
And honestly, writing a mini-changelog definitely seems like overkill with version control, and perhaps in most cases version control metadata is a perfectly acceptable substitute. However, since the file(s) might be distributed outside of version control, where the version control data might not be present (e.g. a release tarball), having at least the copyright information in each file seems useful. Whether it satisfies the "prominent notice that the file is changed" requirement is actually not 100% certain, but I can't imagine it puts you in a worse position to do so.
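To make the tarball point concrete, here is a rough sketch of how you might check a release archive for files that carry no copyright line at all. The function name `members_missing_notice`, the "Copyright" marker and the 2048-byte window are all my own assumptions for illustration, not anything a license prescribes, and passing this check says nothing about whether the legal requirement is met.

```python
import io
import tarfile
from typing import List


def members_missing_notice(
    tar_bytes: bytes, marker: str = "Copyright", max_bytes: int = 2048
) -> List[str]:
    """Return names of regular files whose first bytes lack the marker.

    The marker string and the size of the window are illustrative
    assumptions, not a legal test.
    """
    missing = []
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue  # skip directories, symlinks, etc.
            handle = archive.extractfile(member)
            if handle is None:
                continue
            # Only look at the top of the file, where headers usually live.
            head = handle.read(max_bytes).decode("utf-8", errors="replace")
            if marker not in head:
                missing.append(member.name)
    return missing
```

You would feed it the bytes of the tarball you are about to publish and look at whatever names come back.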
> a) You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change.
while in GPLv3 it says
> a) The work must carry prominent notices stating that you modified it, and giving a relevant date.
Note that git/svn are not always relevant. In particular, it is not uncommon to distribute release code with the .git directory stripped - this does not excuse you from the requirement.
Having said that, I had a quick glance at Linux, and picking a file at random it did have a copyright header, but certainly not one that included a record of every change. https://github.com/torvalds/linux/blob/master/init/calibrate...
So it doesn't look like this is peculiar to v2. But it does seem like people don't follow the letter of the requirement. I wonder if the FSF has ever clarified this.
It's not necessary (for you). But if you want to share your work, it can be very important for those people you share it with!
The reason is quite simple. Some downstream user, who should be able to use your code, may not have it conveyed to them in the way you imagine. They might only receive a single file out of the project, since that's what they needed. Technically, the person who distributed the file to them has violated the license by not sending the license text in-band. This is not good for the community, since it produces confusion about who is allowed to use free software -- it should be everyone, not just people who understand the details of licensing.
That person might also just send parts of a file instead of the entire file with the license at the top.
I don’t believe this is actually true. These instructions are not in the actual license terms and conditions, but rather in a distinct information section after that. In GPL-3.0, “How to Apply These Terms to Your New Programs”; in Apache-2.0, “How to apply the Apache License to your work”. My understanding is that such prescriptions and sections are not normative.
From a pure copyright law perspective: no, you definitely don’t need to put the license in each source file. Copyright is automatic these many years, so you don’t even need a copyright line; and license can be (and practically always is) independent of the code. This would obviously¹ hold for the warranty disclaimer too. Bigger businesses may like to do it for their own convenience of license management, and individuals or groups may like to do it on their work in case individual files are lifted (… as distinct from smaller units, or even taking a file and stripping the header), but basically the era where this sort of thing could arguably be relevant as a mandate is long past.
There’s even more definitely no need for a dozen lines. If you really want to put anything, one line for a copyright declaration and one line for SPDX-License-Identifier seems fair.
As for copyright lines⸺ bah, they’re such a bunch of drivel. The way people use year ranges, or just bump the year, it’s almost all such legal nonsense. As in, “if this stuff actually mattered, you’d probably have lost your copyright protection” nonsense. The fact of the matter is that copyright year stuff wasn’t designed for such easily-edited stuff as software. It was designed for “first edition copyright 1925; renewed 1935; second edition copyright 1945”, that kind of thing.
—⁂—
¹ “Obviously” here means what a normal person would mean; but I acknowledge that some jurisdictions sometimes hold positions that are obvious nonsense.
Now with that said...
> I don’t believe this is actually true. These instructions are not in the actual license terms and conditions, but rather in a distinct information section after that. In GPL-3.0, “How to Apply These Terms to Your New Programs”; in Apache-2.0, “How to apply the Apache License to your work”. My understanding is that such prescriptions and sections are not normative.
There are a few lines in Apache 2 itself which are normative and make reference to obligations regarding notices attached directly to source files. Most notably, see section 4(b)[1]:
> You must cause any modified files to carry prominent notices stating that You changed the files; [...]
Of course, you can argue about what might qualify or not qualify here, e.g. maybe Git metadata is good enough, but then a tarball produced by your Git host of choice would suddenly violate the copyright license obligations.
> From a pure copyright law perspective: no, you definitely don’t need to put the license in each source file. Copyright is automatic these many years, so you don’t even need a copyright line; and license can be (and practically always is) independent of the code.
I have no idea what you mean by this. First of all, of course you don't have to put the entire copyright license into each source file; this is solely about copyright notices, which typically point to a more complete LICENSE/NOTICE file. Secondly,
> license can be (and practically always is) independent of the code.
I'm not sure what this means. Each file of a project either needs to be in the public domain or has to be covered under some kind of copyright license for anyone (aside from the original authors) to be able to distribute it. At least in a conceptual sense, the code and the copyright license are definitely not independent. (This still holds with dual licensing schemes.)
> There’s even more definitely no need for a dozen lines. If you really want to put anything, one line for a copyright declaration and one line for SPDX-License-Identifier seems fair.
Some projects do basically just this, but many of them include both a fuller notice and the SPDX identifier. The SPDX identifier is great because it is machine-readable, and it might help you if you're ever in a situation where it's necessary.
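For reference, the two-line form being discussed can be generated mechanically. This is only a sketch: `make_header` is a hypothetical helper, and "Jane Example" and the year are placeholders I made up. The only real convention here is the `SPDX-License-Identifier:` tag followed by a license identifier.

```python
def make_header(
    holder: str, year: int, license_id: str, comment_prefix: str = "#"
) -> str:
    """Build a two-line header: a copyright line plus an SPDX tag.

    comment_prefix is whatever starts a line comment in the target
    language, e.g. "#" or "//".
    """
    lines = [
        f"{comment_prefix} Copyright {year} {holder}",
        f"{comment_prefix} SPDX-License-Identifier: {license_id}",
    ]
    return "\n".join(lines) + "\n"
```

Calling `make_header("Jane Example", 2024, "Apache-2.0", "//")` gives a `// Copyright 2024 Jane Example` line followed by a `// SPDX-License-Identifier: Apache-2.0` line, which is about as short as a per-file notice gets.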
> As for copyright lines⸺ bah, they’re such a bunch of drivel. The way people use year ranges, or just bump the year, it’s almost all such legal nonsense. As in, “if this stuff actually mattered, you’d probably have lost your copyright protection” nonsense. The fact of the matter is that copyright year stuff wasn’t designed for such easily-edited stuff as software. It was designed for “first edition copyright 1925; renewed 1935; second edition copyright 1945”, that kind of thing.
While registering copyrights or including copyright notices explicitly is not necessary to have protection under copyright law, my layperson understanding is that it does indeed afford you additional protection under law in some cases. My understanding is that since the Berne Convention, pretty much anywhere on Earth anything you produce that's eligible for copyright protection does implicitly get it. However, if you ever actually wind up in court over copyright issues, the lack of clarity on what licenses go where could possibly create reasonable doubt. It's a lot of risk for what ultimately amounts to an aesthetic concern.
P.S.: Also, just so it's clear, I am mainly concerned about the copyright notices because they explicitly denote the copyright license, not because they denote the existence of copyright protection. This is especially nice to have when people contribute patches to your projects, so that it can be as explicit as possible that their contributions are under the same license as the original file.
I will stress again that I am not a legal professional, I don't study law, and at best I have occasionally spoken to people who do. However, I haven't found anyone who would disagree that it is a good idea to provide a full-fat copyright notice when possible. All else being equal, it's just good hygiene at a cost of a kilobyte or so per file.
[1]: https://www.apache.org/licenses/LICENSE-2.0#redistribution
Is there any chance someone else could clear this up?
[0] https://spdx.org/licenses/
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...
Aside: what is with the contemporary obsession with injecting emojis into everything? If I wanted a picture book, well, I'd go read a picture book.
I'm working on tooling that involves automated reading of this info, and it's a lot easier if the tools don't have to do fuzzier matching.
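As an illustration of why the exact tag makes this easy, here is a minimal sketch of the kind of strict matching I mean. It is not the actual tooling: `find_spdx_expression`, the 20-line window and the handling of a trailing `*/` are assumptions chosen for the example.

```python
import re
from typing import Optional

# Matches the tag anywhere in a line; the expression runs to end of line.
_SPDX_RE = re.compile(r"SPDX-License-Identifier:\s*(?P<expr>[^\r\n]+)")


def find_spdx_expression(source: str, max_lines: int = 20) -> Optional[str]:
    """Return the license expression from the first SPDX tag found.

    Only the first max_lines lines are inspected (an assumed cutoff).
    Returns None when no tag is present.
    """
    for line in source.splitlines()[:max_lines]:
        match = _SPDX_RE.search(line)
        if match:
            expr = match.group("expr").strip()
            # Drop a trailing block-comment closer such as "*/".
            if expr.endswith("*/"):
                expr = expr[:-2].rstrip()
            return expr
    return None
```

With a fixed tag there is one regex and no guessing. A free-form "Licensed under the ..." paragraph needs fuzzy matching against the text of every known license, which is the work the tools would rather not do.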