Why do you think it's not a good heuristic to be able to quickly spot the tell-tale signs of LLM involvement, before you've wasted time reading slop?
Yes, there will be false positives. It's a heuristic after all.
If anything, I'd rather that renderers like Markdown just all agree to change " - " to an en dash and " -- " to an em dash. Then we could put the matter to bed once and for all.
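A rule like that would be a trivial post-processing pass. Here's a minimal sketch (the function name and the choice of spaced en dash vs. closed em dash are my own assumptions, not any real Markdown renderer's behavior):

```python
def smarten_dashes(text: str) -> str:
    """Hypothetical renderer pass: " -- " becomes an em dash, " - " an en dash."""
    # Replace the longer sequence first so " -- " is not half-consumed
    # by the " - " rule.
    text = text.replace(" -- ", "\u2014")   # closed em dash (U+2014)
    text = text.replace(" - ", " \u2013 ")  # spaced en dash (U+2013)
    return text
```

Then everyone types plain hyphens and the typography falls out for free, with no signal left about who (or what) wrote the text.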
I was just curious why you've decided paying attention to them is a bad heuristic. Sure, it can change once people instruct their LLMs not to use them, but still, for now, they sure seem to overuse them!
That and "let's unpack this". I swear, I'll forbid ChatGPT from using "unpack" ever again, in any context!
So the only real purpose of the heuristic is to add a tiny extra vote of confidence when I see a comment that otherwise appears to be lazy ChatGPT copypasta. But in such cases I'll predict that it was probably LLM output either way, and I'll judge it to be poor writing that isn't worth my time regardless of whether or not an LLM was involved.
Fundamentally, the issue I'm seeing here is that we're all talking past each other because we need a better standardized term than "LLM output". I suppose "slop" could work if we universally agreed that it referred only to a subset of LLM output, rather than being synonymous with LLM output in general, but I'm not sure that we do universally agree on that.
If someone types the equivalent of a Google search into ChatGPT, or a spammer has an automated process generically reply to social media posts/comments, that's what qualifies to me as "slop". Most of us here have seen it in the wild by now, there's obviously a distinctive common style (at least for now), and I think we can all agree that it sucks. That's very different from someone investing time and/or expertise to produce content that just happens to involve an LLM as one of the tools in their arsenal; the attitude that it isn't different is just the modern equivalent of considering cellular phone calls or typed letters to be "impersonal".
I'm not suggesting that LLM output doesn't tend to have a higher density of em dashes than human output. I'm just pushing back on the idea that the presence of em dashes is sufficient evidence to dismiss something as probably LLM-generated; that's no better than superstition. I mean, I've used em dashes in a number of comments in this thread, and no one has accused me of using an LLM, so it can't be a pattern that anyone puts too much stock in.
Citation needed.
> Who is it helping if we collectively bully ourselves into excising a perfectly good punctuation mark from human language?
Humans can adapt faster than LLM companies, at least for the moment. We need to be willing to play to our strengths.
Who is it helping if we bully ourselves into ignoring a simple, easy "tell"?
https://en.wikipedia.org/wiki/Dash
> Humans can adapt faster than LLM companies
No one said anything about LLM companies. If I were a spammer today, I'd just have my code replace dashes in LLM output with hyphens before posting it. As a human, I'm not going to suddenly stop using dashes because a handful of people are treating a silly meme as if it were a genuinely useful heuristic.
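For what it's worth, that scrubbing step is a one-liner. A minimal sketch (the function name and the exact hyphen substitutions are illustrative assumptions, not anyone's actual spam pipeline):

```python
def launder_dashes(text: str) -> str:
    """Hypothetical scrubber: swap em/en dashes for plain hyphens before posting."""
    text = text.replace("\u2014", " - ")  # em dash (U+2014) -> spaced hyphen
    text = text.replace("\u2013", "-")    # en dash (U+2013) -> bare hyphen
    return text
```

Which is exactly why the "tell" only catches people who aren't trying to evade it.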