I don't understand why everything changes as soon as an LLM is involved. An LLM is just software.
I'm not even sure how one would construct a viable legal argument around that for SOTA models + harnesses, given the number of creative choices that go into building them.
It'd be something like "Yes, we spent billions of dollars and thousands of person-hours creating these things, but none of that creative effort was responsible for or influenced this particular illegal choice the model made."
And they're caught between a rock and a hard place, because if they cripple initiative, they kill their agentic utility.
Ultimately, this will take a DMCA Section 512-like safe harbor law to definitively clear up: one establishing that outcomes from LLMs are the responsibility of their prompting users, even if the LLM takes unintended actions.
I'm not a lawyer, but to me the legal case seems pretty obvious. "We spent billions of dollars creating this thing to be a good programmer, but we did not intend for it to reverse engineer Oracle's database. No creative effort was spent making it good at reverse engineering Oracle's database. The model reverse-engineered Oracle's database because the user directed it to do so."
If merely fine-tuning an LLM to be good at reverse engineering is enough to be found liable when a user does something illegal, what does that mean for torrent clients?
That's the bit that's going to be nasty in evidence. 'So you didn't have any reverse engineering in your training or testing sets?'
So if I ask “how does a real-world, production-quality database implement indexes?” and it says “I disassembled Oracle and it does XYZ”, then I am liable and owe Oracle a zillion dollars?
Whereas if I caveat “you may look at the PostgreSQL or SQLite or other free database engine source code, or industry studies, academic papers; you may not disassemble anything or touch any commercial software” and it disassembles something anyway, I’m still liable?
Who would dare use an LLM for anything in those circumstances?