0 points · pants2 · 2y ago · 0 comments

What if OpenAI were to first summarize or transform the content before training on it? Then the LLM has never actually seen copyrighted content and couldn't produce an exact copy.

1 comment · 1 top-level

bertil · 2y ago

You are assuming the compression is lossy. The stylistic guidelines and personal habits of beat journalists suggest it might not be, depending on how detailed the LLM is. The complaint includes many long quotes that are verbatim sections.