0 points · RandallBrown · 1y ago

Isn't that because ChatGPT is trained on those QA platforms? If real humans aren't answering questions on the Internet, how will LLMs learn the answers to those questions?

15 comments · 4 top-level

Legend2440 · 1y ago · 4 in thread

Not entirely. LLMs are smart enough to answer questions from documentation or from other sources, even if the exact question wasn't asked anywhere.

But Q&A websites do contain information that might not be in other sources, so there would be some loss.

diamond559 · 1y ago

That's just copying from better sources; it doesn't mean the AI is "smarter".

jncfhnb · 1y ago

True. But AI does not close your question because it didn’t like your phrasing.

atonse · 1y ago

AI can also answer any question in your own language.

Imagine all of Stack Overflow seamlessly translated to, say, Thai or Vietnamese.

gradus_ad · 1y ago

How about "more capable"?

layer8 · 1y ago · 4 in thread

While true, this doesn’t change the problem that Q&A platforms have a hard time competing with LLMs, and I don’t see how that is likely to change.

bluefirebrand · 1y ago

It is frankly absurd that they should be expected to.

These LLMs could not exist without them, but now they're expected to compete?

If all of the Q&A platforms die off, how are LLM training datasets going to get new information?

This whole AI boom is typical corporate shortsightedness imo. Kill the future in order to have a great next quarter.

I hope I'm wrong. If I am right, then I hope we figure this out before AI has bulldozed everything into dust.

bloomingkales · 1y ago

> If all of the Q&A platforms die off, how are LLM training datasets going to get new information?

You just take arbitrary data and ask the LLM to put it in Q&A format and generate the synthetic training data. Unless you are suggesting Quora is the source of new information, which I don't agree with.

Quora does not care about the user experience. Their obsession with paywalling killed the site for me over the course of a decade. They literally could not get me to sign up, and boy did they try (I really needed an answer once, too!). My soul really remembers hostile sites.

pipes · 1y ago

In my experience, they do seem to be very good at synthesizing answers from docs. However, I don't know if that will work for edge cases, which is one of the things SO is good at.

chii · 1y ago

> It is frankly absurd that they should be expected to

> These LLMs could not exist without them, but now they're expected to compete?

Yea, those damn tractor makers - they ate the food that the hand farmers used to make! How are hand farmers expected to compete with tractors now, when it's so much more efficient and can do 100x the work!?

shagie · 1y ago · 3 in thread

Q&A tends to be "chunky" and asynchronous in its communication model.

This comes from a reaction to the previous model of forums where it was smaller bits of data spread across multiple comments or posts. I recall going through forums in the days before Stack Overflow, trying to find out how to solve a problem. https://xkcd.com/979/ was very real.

Stack Overflow (and its siblings) was an attempt to change this to a "one spot that has all the information".

That model works, but it is a high-maintenance approach. It means trying to move from a back-and-forth exchange of information, which can only be understood in its entirety across a conversation, to something that more closely resembles a Wikipedia page (one that hides all of the work of Talk:Something). The key thing is that it takes a lot of work to maintain that Q&A format.

And yet, users often don't know what they want. They want that forum model with interaction and step by step hand holding by someone who knows the answer. Stack Overflow was intentionally designed to make that approach difficult in an attempt to make the Q&A the easier solution on the site.

ChatGPT provides the users who want the step by step hand holding an infinitely patient thing behind the screen that doesn't embarrass them in public and is confident that it knows the answer to their problem.

Stack Overflow and Quora and other Q&A forums are the abomination. People want Perlmonks https://www.perlmonks.org/?node_id=11164039 and /r/JavaHelp, where it's interacting with another person in small steps rather than Q&A.

---

The future of "well, what if people stop using the sites that are generating the information that is being used to train the models that people are using to get information" ... that becomes an interesting problem.

I am reminded of Accelerando ( https://www.antipope.org/charlie/blog-static/fiction/acceler... ) and the digital civilizations being various forms of scams and the currency is things that can think new ideas.

The currency is new material that is to be sold. The information gets locked behind some measures to try to make scraping impractical and then sold off wholesale. Humans still talk and answer questions. There are new posts on Reddit about how to solve problems even while ChatGPT is out there. And Reddit is presumably trying to make harvesting the content within its walls something that others have to pay for to get at for training.

wombatpm · 1y ago

See what’s missing is the ability to put your answers on the blockchain …

aleph_minus_one · 1y ago

> And yet, users often don't know what they want. They want that forum model with interaction and step by step hand holding by someone who knows the answer.

This sounds like the seed for a business model to pitch in the next upcoming hype cycle: "short-term hiring of experts with quarter-hourly billing increments enabled by our web- or app-based user interface". :-)

shagie · 1y ago

Maybe... the difficulty is that, with how AI is advancing, it might be hard to distinguish that service from an LLM. You might even get to the call center problem of "initial support handled by someone following a script" before you can get passed to someone who knows how to solve the problem.

If I want someone to walk me through making muffins from scratch, is there a human on the other end of that line (competing with $1/day rates for ChatGPT Pro; it's cheaper than that, but that's the comparison), and are they better than what ChatGPT can do?

It would have to be... quite a bit more than what the LLM would be priced at. The minimum it could reasonably be (without any other things) would get close to $4/15m... and that's minimum wage.

I really don't think that humans are competitive on that timescale or rate.

It would probably be better to hire people at some higher rate to write content for your private model. Brandon Sanderson is considered one of the faster writers (in the fantasy genre) and averages about 2500 words / day ( https://famouswritingroutines.com/collections/daily-word-cou... ), and while he makes a lot more than most authors, let's go with a more typical $75,000 USD / year. At 250 working days per year, that is $300 / day, which works out to $0.12 per word. ... That puts a person in the intermediate to experienced price per word range https://uxwritinghub.com/writers-salary/

Not that I'm suggesting that's the way to do it, but something for LLMs to consider - hire experts to write content for their LLM. $125 per 1000 word blog post.

298 words. I'd like my $37.25 please. Not that I'm asking you for that, but rather that's what my words as training material would be worth.
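The arithmetic in this comment can be checked with a short Python sketch. Every figure comes from the comment itself; the constant and function names are made up for illustration, and the hourly figure simply rescales the "$4/15m" number.

```python
# Back-of-the-envelope check of the figures quoted in the comment.
# All inputs are the commenter's numbers, not measured data.

ANNUAL_SALARY_USD = 75_000           # "more typical" salary from the comment
WORKING_DAYS_PER_YEAR = 250
WORDS_PER_DAY = 2_500                # quoted daily output
BLOG_RATE_USD_PER_1000_WORDS = 125   # flat rate suggested in the comment
COMMENT_WORDS = 298                  # word count the commenter claims
EXPERT_USD_PER_15_MIN = 4            # "$4/15m" floor from the comment


def daily_pay(annual_salary: float, working_days: int) -> float:
    """Pay per working day for a given annual salary."""
    return annual_salary / working_days


def per_word_rate(annual_salary: float, working_days: int, words_per_day: int) -> float:
    """Cost per word if the whole salary is attributed to daily word output."""
    return daily_pay(annual_salary, working_days) / words_per_day


def flat_rate_value(words: int, usd_per_1000_words: float) -> float:
    """Value of a text of `words` words at a flat per-1000-words rate."""
    return words * usd_per_1000_words / 1000


if __name__ == "__main__":
    pay = daily_pay(ANNUAL_SALARY_USD, WORKING_DAYS_PER_YEAR)
    salary_rate = per_word_rate(ANNUAL_SALARY_USD, WORKING_DAYS_PER_YEAR, WORDS_PER_DAY)
    comment_value = flat_rate_value(COMMENT_WORDS, BLOG_RATE_USD_PER_1000_WORDS)
    hourly = EXPERT_USD_PER_15_MIN * 4  # four 15-minute increments per hour

    print(f"daily pay:            ${pay:.2f}")
    print(f"salary-based rate:    ${salary_rate:.3f} per word")
    print(f"flat blog rate:       ${BLOG_RATE_USD_PER_1000_WORDS / 1000:.3f} per word")
    print(f"value of the comment: ${comment_value:.2f}")
    print(f"expert hourly floor:  ${hourly:.2f}")
```

Note that the salary-based rate ($0.12 per word) and the flat blog rate ($0.125 per word) differ slightly; the $37.25 figure corresponds to 298 words at the flat blog rate.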

6510 · 1y ago

Make new q&a websites.
