Better HN

0 points · cdme · 2y ago · 0 comments

Google surfaces data, or at least it used to. LLMs and AI companies actively exploit it, with zero benefit to the creators or users of the platforms they're now cannibalizing.

spxneo · 2y ago

The irony. I'm surprised that businesses built on selling Google search results are allowed to exist. I guess it's for the same reason that Google scraping the internet and building a product on top of it is allowed.

So it only makes sense that scraped AI training data is also going to be tolerated, because to prove infringement you would need to show, through forensic analysis, that a large language model like ChatGPT trained on your copyrighted content can produce a similar derivative of that content.

It's such an uphill battle for copyright holders. They need to replicate: copyrighted input ---> LM similar to ChatGPT4 ---> copyrighted output

So far it's not looking good for OpenAI, because it's possible to generate copyrighted output (type "spiderman" in Czech). All that remains is demonstrating the middle layer (training an LM similar to ChatGPT4 on the copyrighted input), but that is unrealistically expensive.

I have a theory that all this money spent on large models is meant to make discovery impossible (as it would require access to $100 billion worth of GPUs).

cdme (OP) · 2y ago

The whole notion that AI can replace search is nonsense. It yields no benefit to the creators of the results it scrapes and the models hallucinate. It's worse for users and it's worse for everyone producing anything of note online.

spxneo · 2y ago

But many ChatGPT users are not using Google as much, relying instead on LLMs + RAG.

ChatGPT is the new search engine and provides far more value to the end user than Google.

The issue seems to be that people want a payout from OpenAI... but it's a non-profit.

cdme (OP) · 2y ago

It's a shiny toy — it'll yield worse answers. Much like Google's own AI.
