https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...
It also doesn’t speak to the permission or lack thereof of training LLMs on HN content, which was another main point of OP.
If I take a trick like those recommended by the authors of min_p (high temperature + min_p)[1], I do a great job of escaping the "slop" phrasing that is normally detectable and indicative of an LLM. Even more so if I use the anti-slop sampler[2].
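For anyone who hasn't seen it, here is a rough sketch of what "high temperature + min_p" means at the sampling step. This is a toy, standard-library version, not the authors' implementation: the function names, the default values (temperature=1.5, min_p=0.1), and the choice to apply temperature before the min_p cutoff are all my assumptions for illustration. Real inference stacks differ on that ordering.

```python
import math
import random


def min_p_filter(logits, temperature=1.5, min_p=0.1):
    """Toy min_p truncation (hypothetical helper, not a library API).

    Returns renormalized probabilities after temperature scaling,
    with every token below min_p * (top probability) zeroed out.
    """
    # Temperature scaling: higher temperature flattens the distribution.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # min_p: the cutoff scales with the most likely token's probability.
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]

    # Renormalize over the surviving tokens.
    kept_total = sum(kept)
    return [p / kept_total for p in kept]


def sample_token(logits, temperature=1.5, min_p=0.1, rng=random):
    """Sample one token index from the truncated distribution."""
    probs = min_p_filter(logits, temperature, min_p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

The point of the relative cutoff is that when the model is confident, the long tail gets cut hard, and when it is unsure, more candidates survive, which is what lets you turn the temperature up without getting word salad.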
LLMs are already more creative than humans are today, they're already better than humans at most kinds of writing, and they are coming to a comment section near you.
Good luck proving I didn't use an LLM to generate this comment. What if I did? I claim that I might as well have. Maybe I did? :)
[1] https://openreview.net/forum?id=FBkpCyujtS
[2] https://github.com/sam-paech/antislop-sampler, https://github.com/sam-paech/antislop-sampler/blob/main/slop...
> LLMs are allowed on Libera.Chat. They may both take input from Libera.Chat and output responses to Libera.Chat.
This wouldn't help HN.
Nor would the opposite policy, if only because it would encourage accusatory behavior.
The “opposite policy” is sort of the current status quo, per dang:
https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...
See this thread for my own reasoning on the issue (as well as dang’s), as it was raised recently:
https://news.ycombinator.com/item?id=41937993
You’ll need showdead enabled on your profile to see the whole thread, which speaks to the controversial nature of this issue on HN.
I agree that your point about “encouraging accusatory behavior” is well taken, and in the absence of evidence, such accusations would themselves likely run afoul of the Guidelines. But it’s worth noting that dang has said that LLM output itself is generally against the Guidelines. That could lead to a feedback loop of disinterested parties posting LLM content, only to be confronted by interested parties posting uninteresting takedowns of that content and of its posters.
No easy answers here, I’m afraid.
The odds of LLMs being trained on, or queried against, data scraped from HN or HNSearch are even closer to 100%.
I know you don't like the "LLMs are allowed..." part, but they're here and they literally cannot be gotten rid of. However, this rule,
> As soon as possible, people should be made aware if they are interacting with, or their activity is being seen by, a LLM. Consider using line prefixes, channel topics, or channel entry messages.
could be something that is strongly encouraged and helpful, and the "good" LLM users would possibly follow it.
For one, the classic IRC MegaHAL bots, which have been around for decades, are now technically not allowed unless you get permission from Libera staff (and the channel ops). They are Markov chains that continuously train on channel contents as they operate.
But hopefully, as in the past, the Libera staffers will intelligently enforce the spirit of the rules and avoid any silly situations like the above caused by imprecise language.
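For those who never ran one of these bots, here is a toy sketch of a Markov chain that keeps training on channel lines as it sees them. It is a deliberately simplified first-order, word-level model; the real MegaHAL is more elaborate, and the class and method names here (`ChannelMarkovBot`, `learn`, `reply`) are made up for illustration.

```python
import random
from collections import defaultdict


class ChannelMarkovBot:
    """Toy first-order Markov chain bot (hypothetical, not MegaHAL itself)."""

    def __init__(self, seed=None):
        # Maps each word to the list of words that have followed it so far.
        self.transitions = defaultdict(list)
        self.rng = random.Random(seed)

    def learn(self, line):
        """Update the chain with one line of channel text."""
        words = line.split()
        for current, following in zip(words, words[1:]):
            self.transitions[current].append(following)

    def reply(self, start, max_words=12):
        """Walk the chain from a start word until a dead end or the length cap."""
        word = start
        output = [word]
        while len(output) < max_words and self.transitions.get(word):
            word = self.rng.choice(self.transitions[word])
            output.append(word)
        return " ".join(output)


bot = ChannelMarkovBot(seed=0)
bot.learn("bots are fun")    # every channel message is also training data
print(bot.reply("bots"))     # prints "bots are fun"
```

Every message the bot sees goes straight into `learn`, which is exactly the "continuously train on channel contents" behavior that the policy wording sweeps up.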
Rules must take scale into account and do it explicitly to avoid selective enforcement.
There's a difference between one person writing a simple bot and a large corporation offering everyone a bot that pretends to be human. The first is harmless and fun; the second is large-scale, for-profit behavior with proportionally large negative externalities.
Sure, no one is going to go after a random Reddit post, but if a Major Newspaper wants to have AI write their articles, this would have to be labeled. And if your bank gets an LLM support agent, it can no longer pretend to be human. All very desirable outcomes IMHO.
Speaking of SDF, here’s their bot policy:
> [01] CAN I RUN AN IRC BOT HERE??
> IRC BOTs are pretty intensive and most systems and networks ban them.
> In an experiment conducted in 1996 on this system, we allowed users to compile and run their bots. The result was hundreds of megs of disk space became occupied because each user insisted on having their own version of eggdrop uncompressed and untarred in their home directory. All physical memory was in use as ~45 eggdrop processes were running concurrently. The system was basically USELESS and it took 1.5 hours to login if you were patient enough (even from the system console).
> The ARPA members called a vote on the issue and the result was almost a resounding unanimous NO.
> However, there are times when running a bot is useful, for instance keeping a channel open, providing information or just logging a channel. Basically the bot policy here is a bit relaxed for MetaARPA members. Common sense is the rule. As long as you aren't running a harmful process, such as a hijack bot, warez bot or connecting to a server that does not allow bots, then you may run a bot process.
More info about SDF for those who are curious: