Needing to flag nontrivial code as generated was standard practice for my whole career.
AI and humans are not the same as authors of PRs. As an obvious example: one of the important functions of the PR process is to teach the writer how to code in this project, but LLMs fundamentally don't learn the same way humans do, so there's a meaningful difference in context between humans and AIs.
If a human takes the care to really understand and assume authorship of the PR, then it's not really an issue (and if they do, they can easily edit the Claude messages to remove the "generated by Claude" notes manually). Instead, it seems that Claude is just hiding relevant context from the reviewer. PRs without relevant context are always frustrating.
How? LSTM?
Arguably snippet collections belong to this genre.
The following comment in the blog post:
//go:generate stringer -type=Pill
generates a .._string.go file which contains a '.String()' method. I would find it very reasonable to commit that with 'Co-Authored-By: stringer v0.1.0' or such.
Or 'sed s/a/b/g' and 'Co-Authored-By: sed'
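For anyone who hasn't used stringer, here is a rough sketch of the kind of method it produces for that directive. This is a hand-written approximation, not the tool's actual output (the real generated file uses a packed name string and an index table), and the constant names are assumed for illustration since only the `Pill` type name appears above.

```go
package main

import (
	"fmt"
	"strconv"
)

//go:generate stringer -type=Pill

// Pill is the type named in the directive above.
type Pill int

// The constant names are assumptions for illustration only.
const (
	Placebo Pill = iota
	Aspirin
	Ibuprofen
)

// pillNames is a simplified stand-in for the lookup table that the
// generated file would contain.
var pillNames = [...]string{"Placebo", "Aspirin", "Ibuprofen"}

// String approximates the generated method: known values map to their
// constant names, anything else falls back to "Pill(N)".
func (p Pill) String() string {
	if p < 0 || int(p) >= len(pillNames) {
		return "Pill(" + strconv.Itoa(int(p)) + ")"
	}
	return pillNames[p]
}

func main() {
	fmt.Println(Aspirin)  // prints "Aspirin"
	fmt.Println(Pill(42)) // prints "Pill(42)"
}
```

In a real project you would not write `String()` by hand like this; running `go generate` would emit it into its own file, which is the part being committed and attributed.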
How about a compiler?
Similarly, if I use e.g. jextract or uniffi to generate Java interfaces from C code and check that in, I'll create tooling to automatically run those, and the commit will be attributed to that tooling.
If this is not the case you should not be sending it to public repos for review at all. It is rude and insulting to expect the people maintaining these repos to review code that nobody bothered to read.
The difference here is that the generator is a non-deterministic LLM and you can't reason about its output the same way.
As for LLM code assistants, I don't really view them as traditional code generation tools in the first place, as in practice they more resemble something in between autocomplete and delegating to a junior programmer.
As for attribution, I view it more or less the same way as "dictated but not read" in written correspondence, i.e., a disclaimer for errors in the code, which may be considered rude in some contexts, and a perfectly acceptable and useful annotation in others.
No. I don't want to test and pick through your shitty LLM generated code. If I wanted the entire code base to be junk, it'd say so in the readme.
This is not at all the case with LLM-generated code - mostly because you can't regenerate it even if you wanted to, as it's not deterministic.
That said, I do agree that LLM code is different enough from human code (even just with regard to potential copyright worries) that it should be mentioned that LLMs were used to create it.
Replace the gRPC compiler with an LLM. Can you reproduce the output? (Probably not 100%.) Can anybody fix it short of throwing more English phrases at it like "DO NOT", "NEVER", "Under No Circumstances"?
Probably not.
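To make the reproducibility question concrete, here is a minimal sketch of the check a reviewer could run against any generator: run it twice on the same input and compare hashes. `Generator` and `reproducible` are hypothetical names made up for this example, and the two generators in `main` are toy stand-ins, not a real protobuf compiler or LLM client.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Generator is a hypothetical stand-in for any tool that turns an
// input (schema, prompt, source file) into generated code.
type Generator func(input string) []byte

// reproducible reports whether two runs of gen on the same input
// produce byte-identical output, compared via SHA-256 digests.
func reproducible(gen Generator, input string) bool {
	first := sha256.Sum256(gen(input))
	second := sha256.Sum256(gen(input))
	return first == second
}

func main() {
	// Toy deterministic generator: output depends only on the input.
	deterministic := func(in string) []byte {
		return []byte("// generated from " + in + "\n")
	}

	// Toy varying generator: output changes from run to run.
	calls := 0
	varying := func(in string) []byte {
		calls++
		return []byte(fmt.Sprintf("// attempt %d for %s\n", calls, in))
	}

	fmt.Println(reproducible(deterministic, "service.proto")) // prints "true"
	fmt.Println(reproducible(varying, "service.proto"))       // prints "false"
}
```

A traditional generator passes this check, so the checked-in output can always be rebuilt and diffed; a tool whose output varies between runs fails it, which is the difference being argued about here.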
I thought the argument was that AI-users were reviewing and understanding all of the code?