Establishing an etiquette for LLM use on Libera.Chat

(libera.chat)

58 points by easeout, 1y ago, 54 comments

aspenmayer 1y ago

HN would benefit from a specific, explicit policy such as this.

tptacek 1y ago

We have an explicit policy: you can't post LLM stuff directly to HN.

https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...

throwaway314155 1y ago

Doesn't seem very explicit if you have to search a mod's comment history to find it.

aspenmayer 1y ago

This policy should manifest itself in the Guidelines, if HN users are expected to know about it and adhere to it.

jsheard 1y ago

The community already seems to have established a policy that copy-pasting a block of LLM text into a comment will get you downvoted into oblivion immediately.

aspenmayer 1y ago

That rubric only works until sufficiently advanced LLM-generated HN posts are indistinguishable from human-generated HN posts.

It also doesn’t speak to the permission or lack thereof of training LLMs on HN content, which was another main point of OP.

fenomas 1y ago

Sure, and I think the reason is that whatever else they are, LLM outputs are disposable. Posting them here is like posting outputs from Math.random() - anyone who wants such outputs can easily generate their own.

Der_Einzige 1y ago

Bold of you to assume that you will have any idea at all that an LLM generated a particular comment.

If I take a trick like those recommended by the authors of min_p (high temperature + min_p)[1], I do a great job of escaping the "slop" phrasing that is normally detectable and indicative of an LLM. Even more so if I use the anti-slop sampler[2].

LLMs are already more creative than humans are today, they're already better than humans at most kinds of writing, and they are coming to a comment section near you.

Good luck proving I didn't use an LLM to generate this comment. What if I did? I claim that I might as well have. Maybe I did? :)

[1] https://openreview.net/forum?id=FBkpCyujtS

[2] https://github.com/sam-paech/antislop-sampler, https://github.com/sam-paech/antislop-sampler/blob/main/slop...
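For readers unfamiliar with the sampling trick mentioned above, here is a minimal sketch of min_p truncation combined with a high temperature. It assumes one common formulation: scale the logits by the temperature, keep only tokens whose probability is at least min_p times that of the most likely token, then renormalize and sample. The function names, default values, and the order of the two steps are illustrative assumptions, not taken from the linked paper's code or from any particular library.

```python
import math
import random


def min_p_filter(logits, temperature=1.0, min_p=0.1):
    """Return renormalized probabilities after temperature scaling and min_p truncation."""
    # Temperature scaling: values above 1.0 flatten the distribution.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # min_p: the cutoff scales with the probability of the top token,
    # so a confident distribution is pruned harder than a flat one.
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]

    # Renormalize over the surviving tokens (the top token always survives).
    kept_total = sum(kept)
    return [p / kept_total for p in kept]


def sample_token(logits, temperature=1.0, min_p=0.1, rng=random):
    """Sample one token index from the truncated distribution."""
    probs = min_p_filter(logits, temperature, min_p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

The idea is that a high temperature adds variety while the min_p cutoff discards the long tail of unlikely tokens that would otherwise turn the output into noise.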

benatkin 1y ago

Nope.

> LLMs are allowed on Libera.Chat. They may both take input from Libera.Chat and output responses to Libera.Chat.

This wouldn't help HN.

Nor would the opposite policy, if only because it would encourage accusatory behavior.

aspenmayer 1y ago

I have asked dang to comment on this issue specifically in the context of this post/thread.

The “opposite policy” is sort of the current status quo, per dang:

https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...

See this thread for my own reasoning on the issue (as well as dang’s), as it was raised recently:

https://news.ycombinator.com/item?id=41937993

You’ll need showdead enabled on your profile to see the whole thread, which speaks to the controversial nature of this issue on HN.

I agree that your mention of “encouraging accusatory behavior” is a point well-taken, and in the absence of evidence, such accusations themselves would likely run afoul of the Guidelines, but it’s worth noting that dang has said that LLM output itself is generally against the Guidelines, which could lead to a feedback loop of disinterested parties posting LLM content, only to be confronted with interested parties posting uninteresting takedowns of said LLM content and posters of it.

No easy answers here, I’m afraid.

t-writescode 1y ago

The odds of LLMs being used to produce content on HN are approaching 100%.

The odds of LLMs being trained on, or queried against, data scraped from HN or HNSearch are even closer to 100%.

I know you don't like the "LLMs are allowed..." part, but they're here and they literally cannot be gotten rid of. However, this rule,

> As soon as possible, people should be made aware if they are interacting with, or their activity is being seen by, a LLM. Consider using line prefixes, channel topics, or channel entry messages.

Could be something that is strongly encouraged and helpful, and possibly the "good" LLM users would follow it.
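As a rough illustration of the "line prefixes" suggestion in the quoted rule, a bot could tag every line it sends. This is only a sketch: the `[LLM] ` marker and the `format_privmsg` helper are hypothetical names, and the policy does not prescribe any particular prefix. The only thing taken from the IRC protocol is the standard `PRIVMSG <target> :<text>` message form.

```python
LLM_PREFIX = "[LLM] "  # hypothetical marker; the policy does not mandate a specific one


def format_privmsg(target, text, prefix=LLM_PREFIX):
    """Build a raw IRC PRIVMSG line whose text is visibly marked as LLM output."""
    # Replace CR/LF so generated text cannot smuggle extra IRC commands.
    clean = text.replace("\r", " ").replace("\n", " ")
    return f"PRIVMSG {target} :{prefix}{clean}\r\n"
```

The sketch does not split long replies to respect IRC's message length limit; a real bot would need to handle that.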

superkuh 1y ago

Mostly it's just a formalizing of the established status quo. But the changes re: allowing training on chat logs have caused some unintended consequences.

For one, the classic IRC MegaHAL bots, which have been around for decades, are now technically not allowed unless you get permission from Libera staff (and the channel ops). They are Markov chains that continuously train on channel contents as they operate.

But hopefully, as in the past, the Libera staffers will intelligently enforce the spirit of the rules and avoid any silly situations like the above caused by imprecise language.
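To make the "continuously train on channel contents" point concrete, here is a minimal first-order Markov chain sketch. It is a deliberate simplification and not MegaHAL's actual algorithm, which is more elaborate; the `MarkovBot` class and its method names are hypothetical.

```python
import random
from collections import defaultdict


class MarkovBot:
    """Toy first-order word-level Markov chain that learns from each line it sees."""

    def __init__(self, rng=None):
        # Maps a word to the list of words observed immediately after it.
        self.transitions = defaultdict(list)
        self.rng = rng or random.Random()

    def learn(self, line):
        """Update the model with one channel line; this is the 'training' step."""
        words = line.split()
        for current, following in zip(words, words[1:]):
            self.transitions[current].append(following)

    def reply(self, seed, max_words=20):
        """Generate text by walking the chain, starting from the seed word."""
        word = seed
        out = [word]
        for _ in range(max_words - 1):
            choices = self.transitions.get(word)
            if not choices:
                break  # dead end: this word was never seen with a follower
            word = self.rng.choice(choices)
            out.append(word)
        return " ".join(out)
```

A bot like this would call `learn` on every message in the channel and `reply` when addressed, which is why a rule about training on chat logs could be read as covering it.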

comex 1y ago

By its wording, the policy is specifically about training LLMs. A classic Markov chain may be a language model, but it’s not a large language model. The same rules might not apply.

superkuh 1y ago

Yeah, you'd think, but this one was run by the staff in #libera the other night after the announcement and it sounded like they believed markovs technically counted. But I imagine as long as no one is rocking the boat they'll be left alone. Perhaps there was some misunderstanding on my part.

martin-t 1y ago

A classic example of a community self-regulating until overwhelmed, at which point rules are imposed which ban previously accepted and harmless behavior.

Rules must take scale into account and do it explicitly to avoid selective enforcement.

There's a difference between one person writing a simple bot and a large corporation offering a bot pretending to be human to everyone. The first is harmless and fun; the second is large-scale, for-profit behavior with proportionally large negative externalities.

fjdjshsh 1y ago

I strongly believe it should be illegal to post something generated automatically by an LLM without clearly identifying it as such. I hope countries start passing these laws soon.

conception 1y ago

Why pass a law that’s completely unenforceable?

theamk 1y ago

It is somewhat enforceable.

Sure, no one is going to go after a random Reddit post, but if a Major Newspaper wants to have AI write its articles, this would have to be labeled. And if your bank gets an LLM support agent, it can no longer pretend to be human. All very desirable outcomes IMHO.

karlgkk 1y ago

It's not unenforceable at all. Major players would be forced to abide by it, smaller players would reduce their use of LLMs, and not marking LLM content would be a bannable offense on most platforms.

ranger_danger 1y ago

Now can Libera please establish etiquette for channel mods? All the biggest channels have extremely toxic, egotistical mods with god complexes visible from space.

aspenmayer 1y ago

Have you seen examples of such codes of conduct in the IRC context before? Closest thing I can think of maybe is SDF’s or other shared systems’, but such rules seem somewhat quaint compared to norms on IRC.

Speaking of SDF, here’s their bot policy:

https://sdf.org/?faq?CHAT?01

> [01] CAN I RUN AN IRC BOT HERE??

> IRC BOTs are pretty intensive and most systems and networks ban them.

> In an experiment conducted in 1996 on this system, we allowed users to compile and run their bots. The result was hundreds of megs of disk space became occupied because each user insisted on having their own version of eggdrop uncompressed and untarred in their home directory. All physical memory was in use as ~45 eggdrop processes were running concurrently. The system was basically USELESS and it took 1.5 hours to login if you were patient enough (even from the system console).

> The ARPA members called a vote on the issue and the result was almost a resounding unanimous NO.

> However, there are times when running a bot is useful, for instance keeping a channel open, providing information or just logging a channel. Basically the bot policy here is a bit relaxed for MetaARPA members. Common sense is the rule. As long as you aren't running a harmful process, such as a hijack bot, warez bot or connecting to a server that does not allow bots, then you may run a bot process.

More info about SDF for those who are curious:

https://en.wikipedia.org/wiki/SDF_Public_Access_Unix_System

bawolff 1y ago

As far as I can tell, this policy is essentially: don't do anything with an LLM that would get you banned if you did it manually as a human.
