The HTTP protocol does not specify what is right and wrong. The fact that a protocol encodes or permits a particular kind of behaviour does not mean that every use of the protocol is ethically justified. I am sure you would agree with me that "black people can't visit this server" would be an unethical rule, even though HTTP permits you to enforce it. So let's forget about the protocol for a minute.
Is it morally wrong to lie about your User Agent in order to visit a website? Well, that depends on whether it is legitimate for the server operator to discriminate according to the User Agent. If it is not legitimate, then lying about your User Agent to circumvent the restriction is morally justified.
So we are back at square one: is it legitimate for a server operator to discriminate based on what sort of client is used to visit them? Since the service is public, the person is allowed to visit the service and to read the content. If the client misbehaves in some way (some LLM scrapers do), then that is a legitimate difference. But if this is controlled for, so that the LLM scraper can't easily be distinguished from a human doing the same thing, then the service is not harmed any more than it would be by ordinary use. Therefore the discrimination is not legitimate.
Likewise, I may prevent certain user agents from visiting my site. If you - say, an AI megacorp - are intentionally spoofing the user agent to appear as a user, you are also violating consent.
No, you wouldn't be. Even if someone tells you not to visit their site, you have every legal right to continue visiting it, at least in the US.
Under the common interpretation of the CFAA, there needs to be a formal mechanism that restricts access to authorized users. E.g. you could be charged if you hacked into a password-protected area of someone's site. But if you're merely told "hey bro don't visit my site", that's not going to reach the required legal threshold.
Which is why crawlers aren't breaking the law. If you want to restrict authorization, you need to actually implement that as a mechanism by creating logins, restricting content to logged-in users, and not giving logins to crawlers.
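As a rough sketch of what "implement that as a mechanism" could look like in practice (every name below is made up for illustration, and this says nothing about what a court would actually require), the check is on credentials, not on the User-Agent header:

```python
# Hypothetical names throughout: PROTECTED_PREFIX, VALID_SESSIONS, status_for.
PROTECTED_PREFIX = "/members/"
VALID_SESSIONS = {"token-alice"}  # sessions issued only to accounts the operator approved


def status_for(path, session_token=None):
    """Return the HTTP status code a server would send for this request."""
    if not path.startswith(PROTECTED_PREFIX):
        return 200  # public content: served to anyone, whatever client they use
    if session_token is None:
        return 401  # no credentials presented
    if session_token not in VALID_SESSIONS:
        return 403  # credentials presented, but not authorized
    return 200  # logged-in, authorized user


print(status_for("/blog/post"))                # public page
print(status_for("/members/archive"))          # no login
print(status_for("/members/archive", "bogus")) # login the operator never issued
```

A crawler that was never given a login only ever sees the public part, and getting past the 401/403 would mean defeating an actual access control rather than ignoring a request.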
LLM programs do not have human rights.
Computer programs don't take actions, people do. If I use a web browser, or scrape some site to make an LLM, that's me doing it, not the program. And I have human rights.
If you think training LLMs should be illegal, just say that. If you think LLM companies are putting an undue strain on computer networks and they should be forced to pay for it, say that. But don't act like it's a virtue to try and capriciously gatekeep access to a public resource.
For example - humans can learn, programs can't. The "learning" cop-out for LLM corpos shouldn't be accepted by anyone, let alone by the law. Humans have a fair use carve-out in copyright law not because it's something axiomatic, but because some humans with empathy forced others to allow all humans some leeway in legally using others' IP. Just because such a law exists for humans doesn't mean it should apply to random computer programs. Scraping the web for LLMs should not be considered "fair use" because a) it clearly is not (the result is commercialized later) and b) programs aren't humans and don't have equal rights.
And the list goes on. Now, I do get that the train has long since left the station and we are all collectively living in the anecdote about stealing a bicycle and asking God for forgiveness. But that doesn't mean I agree with this state of affairs. I'm just shouting my displeasure at that passing train because I'm weird like that. It's like with climate change - we are doing nothing that matters, no one discusses what really matters, and I've just accepted that nothing will really change. Doesn't mean I like the situation.
PS: tl;dr - LLMs clearly should be legal, it's just simple code is all. LLM corporations who steal IP content without compensation to the authors should be illegal, but of course they won't ever be.
PPS: there is a huge, gigantic gap between a single person scraping a few thousand pages for personal use, maybe even some small local commercial use (though that's a grey area already), and a billion-dollar megacorp intent on destroying everything of value for humans on the internet for profit.