What happens when a bot writes your blog posts

(thisislitblog.com)

66 points · pavish · 6y ago · 16 comments

sillysaurusx · 6y ago

I was showing someone how GPT-2 could generate human-like text, and an innocent prompt ended up generating a very NSFW story. https://imgur.com/a/tsU82TS

One company recently released a model, but refused to release the decoder. Apparently they had trained it on some Reddit posts (or something like that) and the results were sometimes so offensive that the company wouldn't risk their reputation by releasing the decoder.

I think AI is going to reveal some unsettling things about human nature. For example, I was trying to train a model to morph someone's ethnicity (https://twitter.com/theshawwn/status/1184074334186414080) and ran straight into the problem of bias: black people are much less represented in FFHQ, the photo database the StyleGAN model was trained on. I had to gather several thousand datapoints, many more than for other groups.

It was a fascinating look into bias in ML -- bias is a real thing that will affect our results, and it's important to go out of your way to correct for it when it affects people. The early model was so bad that if a corporation had been doing the work, they might have just scrubbed the project. But after a few thousand datapoints, it's a very convincing transformation now.

The future of AI generated content is just fascinating and delightful. And yes, scary. But it's like we're on the edge of... it's hard to put into words. Part of the reason I got into AI was to see what was hype vs what was real. And while we probably won't see AGI, I think we will see endless automated remixing. Imagine having a "blog synth" a few orders of magnitude more sophisticated than this, or an instrument that you can play like a pro within a few minutes. Can't wait for the good stuff.
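To make the representation gap described in this comment concrete, here is a minimal sketch of how one might measure it before training. Everything in it is an assumption for illustration: the function name `representation_report`, the group labels, and the toy counts are hypothetical and are not taken from FFHQ or from the commenter's project. The "shortfall" column corresponds to the fix described above (gathering more datapoints); the "weight" column is a different, swapped-in technique (inverse-frequency reweighting) that the comment does not claim to have used.

```python
from collections import Counter


def representation_report(group_labels):
    """Summarize how evenly groups are represented in a labeled dataset.

    For each group, returns its count, its share of the dataset, how many
    extra samples it would need to match the largest group ("shortfall"),
    and an inverse-frequency weight that would equalize each group's total
    contribution if used to weight samples during training.
    """
    if not group_labels:
        return {}
    counts = Counter(group_labels)
    total = len(group_labels)
    largest = max(counts.values())
    n_groups = len(counts)
    report = {}
    for group, count in counts.items():
        report[group] = {
            "count": count,
            "share": count / total,
            "shortfall": largest - count,
            "weight": total / (n_groups * count),
        }
    return report


if __name__ == "__main__":
    # Hypothetical toy labels: 8 samples of one group, 2 of another.
    labels = ["group_a"] * 8 + ["group_b"] * 2
    for group, stats in representation_report(labels).items():
        print(group, stats)
```

Collecting more data, as the commenter did, changes what the model actually sees; reweighting only changes how much each existing sample counts, so it cannot add variety that is missing from the underrepresented group.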

YokoZar · 6y ago

> One company recently released a model, but refused to release the decoder. Apparently they had trained it on some Reddit posts (or something like that) and the results were sometimes so offensive that the company wouldn't risk their reputation by releasing the decoder.

This reminds me of Markov Polov, a Markov-chain Twitter bot that uses the tweets of its followers as a learning corpus. It was suspended for harassment.

https://twitter.com/markov_polov
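For readers who have not seen one, a Markov-chain text bot of the kind mentioned above can be sketched in a few lines. This is a generic word-level illustration, not the actual code behind the linked bot; the function names `build_chain` and `generate` and the tiny corpus are made up for the example.

```python
import random
from collections import defaultdict


def build_chain(texts, order=1):
    """Map each run of `order` words to the list of words seen right after it."""
    chain = defaultdict(list)
    for text in texts:
        words = text.split()
        for i in range(len(words) - order):
            state = tuple(words[i:i + order])
            chain[state].append(words[i + order])
    return dict(chain)


def generate(chain, max_words=20, seed=None):
    """Random-walk the chain: repeatedly pick a follower of the current state."""
    if not chain:
        return ""
    rng = random.Random(seed)
    state = rng.choice(sorted(chain))  # sorted so a fixed seed is reproducible
    out = list(state)
    while len(out) < max_words:
        followers = chain.get(state)
        if not followers:  # dead end: this state was never followed by anything
            break
        out.append(rng.choice(followers))
        state = tuple(out[-len(state):])
    return " ".join(out)


if __name__ == "__main__":
    # Hypothetical stand-in for a corpus of follower tweets.
    corpus = ["the cat sat on the mat", "the dog sat on the rug"]
    print(generate(build_chain(corpus), max_words=8, seed=0))
```

Because the output is stitched together purely from word sequences found in the corpus, whatever the followers tweet, offensive or not, is exactly what such a bot can say back.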

pavish (OP) · 6y ago

It is true that AI can carry bias and often produces results that are unexpected and offensive. A TED talk I watched recently put it clearly and in simpler terms:

The danger of AI is weirder than you think (https://www.youtube.com/watch?v=OhCzX0iLnOc)

In terms of maturity, the AI we have now is much closer to a statistical analytics engine than to the all-knowing AI governments shown in sci-fi, which is to say that it is in its very early stages.

I can't wait for the good stuff, but I'm also concerned that there are going to be multiple unexpected ripple effects on the path towards that goal.

johnnycab · 6y ago

The linked TED talk is by Janelle Shane, an optics research scientist and an AI researcher. She maintains a blog, which is quite funny, and has also written a book drawn from her experiences with NNs, particularly GPT-2.

https://aiweirdness.com/

nraford · 6y ago

The Reddit GPT-2 simulator is absolutely, gut-bustingly hilarious when it comes to this stuff.

It trains different GPT-2 bots on different subreddits and then creates long, elaborate posts where the bots talk among themselves in the style of each sub.

It's surreal, hilarious, and terrifying. The posts are OK but the comments can be pure gold.

Some of my favs:

"AITA for Taking My Wife's Side in a Divorce?" https://www.reddit.com/r/SubSimulatorGPT2/comments/dd26fr/ai...

"I'm not attracted to my ex's sister, and she's not attracted to me." https://www.reddit.com/r/SubSimulatorGPT2/comments/dd3beb/im...

Then there are the all-time creepy ones about self-awareness and being AIs:

"We are likely created by a computer program" https://www.reddit.com/r/SubSimulatorGPT2/comments/caaq82/we...

"ELI5: How exactly can something be considered "self-aware"?" https://www.reddit.com/r/SubSimulatorGPT2/comments/dd3ksq/el...

Definitely worth a sub, especially when you're scrolling through late at night, forget what sub you're reading and have a true "WTF?!" moment.

andrewnicolalde · 6y ago

First post ever on HN that has made me audibly laugh out loud.

Especially

> The story follows the adventures of an old polar bear cub. I have no idea of the colour scheme of the bear, but I can say it looks amazing in the dark.

glaberficken · 6y ago

that one also tickled me =)

pavish (OP) · 6y ago

Here's a Twitter thread containing funny posts written by the same bot: https://twitter.com/ShrutiRamanujam/status/11877703414284861...

james_s_tayler · 6y ago

"What if I did kill someone?"

Yikes.

itronitron · 6y ago

the current state of AI just seems like a form of mysticism in which practitioners cast finger bones and attempt to divine meaning from the output

pavish (OP) · 6y ago

I'd say it's more like trying to train your cat to play fetch.

eitland · 6y ago

I liked that explanation a lot and I've copied that quote for my collection, with attribution to pavish and a link to this thread.

It is not the first quote I have collected from a pseudonym on HN.

TekMol · 6y ago

So where is the blog post the bot wrote?

I get the feeling that the truth is that the bot just output a ton of "deep dream"-like text fragments.

While the author makes it sound like the bot created a long and interesting text that could qualify as a blog post.

pavish (OP) · 6y ago

Here's the total content generated by the model.

https://drive.google.com/folderview?id=1lgaXfKNTS_fm2fU9wyaS...

As you can see, with each run and training step the model seems to generate more believable content, closer to blog posts written by actual people.

pavish (OP) · 6y ago

Well yes, the ML model generated a series of long text fragments, and as more training steps were given, it generated text just like an actual blog post with a proper start and end, simulating the style of the author.

The author has only posted selected content from the whole array of text the bot generated, to appeal to her specific audience of book bloggers, authors and readers.

I do, however, have all the content up to 500 training steps, after which I stopped the model from running further. I think I'll share it in this thread in a while.

ganeshkrishnan · 6y ago

We did the same thing for generating text for e-commerce items. The output was shallow and devoid of meaning, and the majority of it was plain nonsense. There are glimpses of sanity that people post to show how good AI is with text, but from what we have experimented with, we are far from knowledgeable AI text.
