The other day I asked it about the place I live, and it made up nonsense. I was trying to get it to help me with an essay, and it was just wrong; it was telling me things about this region that weren't real.
Do we just drive through a town, ask for a made-up history about it, and just be satisfied with whatever is provided?
I have tried to use it many times to learn a topic, and my experience has been that it is either frustratingly vague or incorrect.
It's not a tool I can fully add to my workflow until it's reliable, but I seem to be the odd one out.
I find this highly concerning, but I feel the same way you do.
Even "smart people" I work with seem to have gulped down the LLM cool aid because it's convenient and it's "cool".
Sometimes I honestly think: "just surrender to it all, believe everything the machine tells you unquestioningly, forget the fact-checking, it feels good to be ignorant... it will be fine...".
I just can't do it though.
It's the same issue with Google Search, any web page, or, heck, any book. Fact checking gets you only so far. You need critical thinking. It's okay to "learn" wrong facts from time to time as long as you are willing to be critical and throw the ideas away if they turn out to be wrong. I think this Popperian view is much more useful than living with the idea that you can only accept information that is provably true. Life is too short to verify every fact. Most things outside programming are not even verifiable anyway. By the time that Steve Jobs would have "verified" that the iPhone was certainly a good idea to pursue, Apple might have been bankrupt. Or in the old days, by the time you have verified that there is a tiger in the bush, it has already eaten you.
The benefit is that I got a quick look at various solutions, quickly satisfied a curiosity, and decided whether I'm interested in the concept or not. Without AI, I might just leave the idea alone or spend too much time figuring it out. Or perhaps never quite figure out the terms of what I'm trying to discover, since it's good at connecting dots when you have an idea with some missing pieces.
I wouldn't use it for conversations the way others are describing; I need a way to verify its output at any time. I find that idea bizarre: just chatting with a hallucinating machine. Yet I still find it useful as a sort of "idea machine".
I think it's because Americans, more than nearly all other cultures, love convenience. It's why the love for driving is so strong in the US. Don't walk or ride, drive.
Once I was walking back from the grocer in Florida with four shopping bags, and people pulled over to ask if my car had broken down and if I needed a ride. People were stunned... I was walking for exercise and for the environment... and I was stunned.
More evidence of this trend can be seen in the products and marketing being produced:
Do you need to write a wedding speech? Click here.
Do you need to go get something from the store? Get your fat ass in the car and drive. Better yet, get a car that drives for you. Better still, we'll deliver it with a drone... don't move a muscle.
Don't want to do your homework? Here...
Want to produce art? Please enter your prompt...
Want to lose weight? We have a drug for that...
Want to be the authority on some topic? We'll generate the facts you need.
This. I hate being told wrong information, because then I have to unlearn it. I would rather have been told nothing.
They're only good on universal truths. An amalgam of laws from around the globe doesn't tell me what the law is in my country, for example.
I feel like using LLMs today is like using search 15 years ago: you develop a feel for getting the results you want.
I'd never use ChatGPT for anything that's even remotely obscure, controversial, or niche.
But through all my double-checking, I've had a phenomenal success rate in getting useful, readable, valid responses on well-covered, well-documented topics such as introductory French, introductory music theory, and well-covered, non-controversial history and science.
I'd love to see the example you experienced; if I ask ChatGPT "tell me about Toronto, Canada", my expectation would be high accuracy. If I asked it "Was Hum, Croatia, part of the Istrian liberation movement in the seventies", I'd have far less confidence: it's a leading question, on a less-covered topic, introducing inaccuracies in the prompt.
My point is: for a 3-hour drive to the cottage, I'm OK with something that's only 95% accurate on easy topics! I'd get no better from my spouse or best friend if they made it on the same drive :). My life will not depend on it, I'll have an educationally good time, and the miles will pass faster :).
(also, these conversations always seem to end in suffocatingly self-righteous "I don't know how others can live in this post-fact free world of ignorance", but that has a LOT of assumptions and, ironically, non-factual bias in it as well)
I don't think it's quite the same.
With search results, aka web sites, you can compare between them and get a "majority opinion" if you have doubts - it doesn't guarantee correctness but it does improve the odds.
Some sites are also more reputable and reliable than others - e.g. if the information is from Reuters, a university's courseware, official government agencies, ... etc. it's probably correct.
With LLMs you get one answer and that's it. Some, like Bard, provide alternate drafts, but they are all from the same source and can all be hallucinations...
Yes and no. If the LLM is repeating the same thing across multiple drafts, then it's very unlikely to be a hallucination.
It's when multiple generations are all saying different things that you need to take notice.
LLMs hallucinate, yes, but getting the same hallucination multiple times is incredibly rare.
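For what it's worth, that check is easy to automate. A minimal sketch using the OpenAI Python client, sampling the same question several times and seeing whether the answers agree; the model name and majority threshold are my own placeholder choices, and real use would need fuzzy/semantic comparison rather than the exact string match shown here:

    from collections import Counter
    from openai import OpenAI

    client = OpenAI()

    def consistent_answer(question, samples=5):
        # Ask for several independent completions of the same question.
        response = client.chat.completions.create(
            model="gpt-4o",   # placeholder model choice
            messages=[{"role": "user", "content": question}],
            n=samples,
            temperature=1.0,  # keep sampling diverse on purpose
        )
        answers = [c.message.content.strip() for c in response.choices]
        best, count = Counter(answers).most_common(1)[0]
        # If most drafts agree, the answer is less likely to be a one-off
        # hallucination; disagreement is the signal to get suspicious.
        return best if count > samples // 2 else None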
I've seen the hallucination rate of LLMs improve significantly; if you stick to well-covered topics, they probably do quite well. The issue is that they often have no tells when making things up.
A person who uses ChatGPT must have the understanding that it's not like Google search. The layman, however, has no idea that ChatGPT can give coherent incorrect information and treats the information as true.
Most people won't use it just for infotainment, and OpenAI will try its best to downplay hallucinations as fine print if it goes fully mainstream like Google search.
Rather than asking it about facts, I find it useful to derive new insights.
For example: "Tell me 5 topics about databases that might make it to the front page of hacker news." It can generate an interesting list. That is much more like the example they provided in the article, synthesizing a bed time story is not factual.
Also, "write me some python code to do x" where x is based on libraries that were well documented before 2022 also has similarly creative results in my experience.
Like talking to most people you mean?
You will now be able to feed it images and responses from the customers. Give it a function to call, complementaryDrink(customerId). Combine it with a simple vending-machine-style robot, or something more complex that can mix drinks.
I'm not actually in a hurry to try to replace bartenders. Just saying these types of things immediately become more feasible.
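For the curious, the function-calling half of this is only a few lines with the OpenAI Python client. A rough sketch, keeping the complementaryDrink(customerId) signature from above; the model name, the prompts, and the dispatch step are assumptions for illustration, not a real bartender system:

    from openai import OpenAI

    client = OpenAI()

    # Describe the tool so the model can decide when to call it.
    tools = [{
        "type": "function",
        "function": {
            "name": "complementaryDrink",
            "description": "Offer a free drink to a customer.",
            "parameters": {
                "type": "object",
                "properties": {"customerId": {"type": "string"}},
                "required": ["customerId"],
            },
        },
    }]

    response = client.chat.completions.create(
        model="gpt-4o",  # assumed tool-capable model
        messages=[
            {"role": "system", "content": "You are a bartender. Offer a free drink to regulars who seem unhappy."},
            {"role": "user", "content": "Customer 42 just sighed and pushed their empty glass away."},
        ],
        tools=tools,
    )

    # If the model chose to call the function, hand it off to the drink robot.
    for call in response.choices[0].message.tool_calls or []:
        if call.function.name == "complementaryDrink":
            print("dispensing drink:", call.function.arguments)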
You can also see the possibilities of the speech input and output for "virtual girlfriends". I assume someone at OpenAI must have been tempted to train a model on Scarlett Johansson's voice.
If people are treating LLMs like random strangers and only making small talk, fair enough, but more often they're treating them like an infallible fount of knowledge, and that's concerning.
That's on them. I mean, people need to figure out that LLMs aren't random strangers, they're unfiltered inner voices of random strangers, spouting the first reaction they have to what you say to them.
Anyway, there is a middle ground. I like to ask GPT-4 questions within my area of expertise, because I'm able to instantly and instinctively - read: effortlessly - judge how much to trust any given reply. It's very useful this way, because rating an answer in your own field takes much less work than coming up with it on your own.
All human interactions from all of history called and they …
I verify just about everything that I ask it, so it isn’t just a general sense of improvement.
Ah yes, I don't understand how to talk to people either!
Comments like yours make me think that no one cares about this...and judging by a lot of the other comments, I guess they don't.
It's probably going to be people wading through a sea of AI-generated shit, with the individual expected to just forever "apply critical thinking" to it all. Even a call from one's spouse could be fake, and you'll just have to apply critical thinking or whatever to work out whether you were scammed or not.
Then it makes stuff up far less frequently.
If the next version has the same step up in performance, I will no longer consider inaccuracy an issue - even the best books have mistakes in them, they just need to be infrequent enough.
> Then it makes stuff up far less frequently.
Now there's a business model for a ChatGPT-like service.
$1/month: Almost always wrong
$10/month: 50/50 chance of being right or wrong
$100/month: right 95% of the time