Again, exceeding authorized access means using your authorized access to obtain information you were not "entitled" to. So the question is not 'were you authorized' but 'were you entitled' to that information. WTF 'entitled' means is another question entirely, but likely it is in the eye of the beholder. A jury decided Weev was not 'entitled' to the email addresses he downloaded from AT&T, and it's safe to assume we are not 'entitled' to free access to WSJ's content. So I would not pin your hopes on the "200 OK".
So, your example is...not an example?
>WTF 'entitled' means is another question entirely
No, in this case it is very clear: a request containing a particular user agent string is entitled. I have not tried this myself, but presumably you could verify that is the case by sending a request with the appropriate user agent.
Again, I think you're conflating the fact that someone could trick the server into delivering the content for free with WSJ intending to deliver its content to you for free. Since WSJ clearly intends its content to be delivered for free only to Googlebot, and to users only if they pay, it is likely a jury would consider this a violation of the CFAA.
A web server returning 200 OK is not ipso facto a guarantee that the person making the request is not committing a crime. A more obvious example: a request whose header contains a stolen authorization token will get a 200 too. The law does not require that the access control be non-trivial to defeat.
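To make that concrete, here is a rough sketch of what a naive paywall check might look like. Everything in it is hypothetical (the `respond` function, the marker string, the token value); it is not WSJ's actual setup, just an illustration that the status code only reflects what the headers say, not who sent them.

```python
from typing import Mapping

# Hypothetical values, chosen purely for illustration.
CRAWLER_MARKER = "Googlebot"
VALID_TOKENS = {"token-issued-to-alice"}


def respond(headers: Mapping[str, str]) -> int:
    """Return the HTTP status a naive paywall might send for these headers."""
    user_agent = headers.get("User-Agent", "")
    auth = headers.get("Authorization", "")
    # Strip the "Bearer " scheme prefix if present.
    token = auth[len("Bearer "):] if auth.startswith("Bearer ") else auth

    if CRAWLER_MARKER in user_agent:
        # The header claims to be the crawler; nothing here verifies the claim.
        return 200
    if token in VALID_TOKENS:
        # The token is valid; nothing here checks who is presenting it.
        return 200
    return 402  # Payment Required
```

Both `200` branches fire on header contents alone, so a spoofed User-Agent or someone else's token gets the same answer as the real crawler or the real subscriber. The server has no notion of the person, which is the thing the statute asks about.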
I don't like it, and I think the CFAA is seriously problematic, but it is the law and the Feds have been known to enforce it.
At least in the US, the law doesn't work that way. Decisions will quite often cite some other similar case which reached the opposite conclusion, but under different circumstances, because that other case's decision says something like "X, if it weren't for Y" or "Fortunately for the defendant, they didn't Z, so not X", or something. That isn't binding precedent for the judge to apply X, but it's a very strong sign that X would be reasonable.
A court case that says "Yes, this violates CFAA but we have to throw out the case because A, B, and C" is very strong reason to believe that, if the next prosecutors avoid A, B, and C, the next judge will say "Yes, this still violates CFAA."
(IANAL but I read court cases because I find it useful to understand my jurisdiction's legal system.)
> a request containing a particular user agent string is entitled.
The phrasing of the law is very clear that the word "entitled" applies to a person, not to a request. Stealing someone's password and using their account is definitely a violation of the CFAA (see e.g. http://www.wiggin.com/16332). In such a case, the account used to log in is quite plainly "entitled" / "authorized"; that's how you get the data. But the person logging in is not "entitled".
> Likewise, implementations are encouraged not to use the product tokens of other implementations in order to declare compatibility with them, as this circumvents the purpose of the field. If a user agent masquerades as a different user agent, recipients can assume that the user intentionally desires to see responses tailored for that identified user agent, even if they might not work as well for the actual user agent being used.
That sure sounds like impersonating other user agents is allowed, but not encouraged. That is a clear distinction from being malformed.
Obtaining paywall-protected content by faking your user agent to pass yourself off as Google's crawler is quite clearly fraudulent. This isn't a point for debate.
PS. To play along with the linguistic theme, can you provide a source for the definition of a malformed request? My original intent when using the word malformed was not to invoke its technical definition but rather its dictionary definition. Having said that, I just had a 30-second Google hunt and couldn't find anything to corroborate your position.
That is demonstrably false. I, personally, can consider any request malformed unless it starts with the letter W. Regardless of what the standards say, I can think whatever I want to.
Similarly, the law can make up whatever rules IT wants to about the definition of "malformed". In that case, it pays some scant attention to things like standards, but mostly cares about "to the random guy-on-the-street (jury member or judge), did this seem like stealing?" And there, I am afraid you lose.
Unfortunately, when the prosecutors come, they will only care about the legal process.
The legal world cares about how the law applies to the facts of the case, not about how common sense applies.
Not saying I like it.