The other day I asked it about the place I live, and it made up nonsense. I was trying to get it to help me with an essay, and it was just wrong; it was telling me things about this region that weren't real.
Do we just drive through a town, ask for a made-up history about it, and just be satisfied with whatever is provided?
I have tried to use it many times to learn a topic, and my experience has been that it is either frustratingly vague or incorrect.
It's not a tool I can fully add to my workflow until it's reliable, but I seem to be the odd one out.
I find this highly concerning, but I feel the same way you do.
Even "smart people" I work with seem to have gulped down the LLM cool aid because it's convenient and it's "cool".
Sometimes I honestly think: "just surrender to it all, believe everything the machine tells you unquestioningly, forget the fact-checking, it feels good to be ignorant... it will be fine...".
I just can't do it though.
It's the same issue with Google Search, any web page, or, heck, any book. Fact checking gets you only so far. You need critical thinking. It's okay to "learn" wrong facts from time to time as long as you are willing to be critical and throw the ideas away if they turn out to be wrong. I think this Popperian view is much more useful than living with the idea that you can only accept information that is provably true. Life is too short to verify every fact. Most things outside programming are not even verifiable anyway. By the time that Steve Jobs would have "verified" that the iPhone was certainly a good idea to pursue, Apple might have been bankrupt. Or in the old days, by the time you have verified that there is a tiger in the bush, it has already eaten you.
The benefit is that I got a quick look at various solutions, quickly satisfied a curiosity, and decided whether I'm interested in the concept or not. Without AI, I might just leave the idea alone or spend too much time figuring it out. Or perhaps never quite figure out the terms of what I'm trying to discover, since it's good at connecting dots when you have an idea with some missing pieces.
I wouldn't use it for conversations the way others are describing; I need a way to verify its output at any time. I find that idea bizarre: just chatting with a hallucinating machine. Yet I still find it useful as a sort of "idea machine".
I think it's because Americans, more than nearly all other cultures, love convenience. It's why the love for driving is so strong in the US. Don't walk or ride, drive.
Once I was walking back from the grocer in Florida with four shopping bags, and people pulled over to ask if my car had broken down and if I needed a ride. People were stunned... I was walking for exercise and for the environment... and I was stunned.
More evidence of this trend can be seen in the products and marketing being produced:
Do you need to write a wedding speech? Click here.
Do you need to go get something from the store? Get your fat ass in the car and drive. Better yet, get a car that drives for you. Better still, we'll deliver it with a drone... don't move a muscle.
Don't want to do your homework? Here...
Want to produce art? Please enter your prompt...
Want to lose weight? We have a drug for that...
Want to be the authority on some topic? We'll generate the facts you need.
This. I hate being told wrong information, because then I have to unlearn it. I would rather have been told nothing.
They're only good on universal truths. An amalgam of laws from around the globe doesn't tell me what the law is in my country, for example.
I feel like using LLMs today is like using search 15 years ago: you develop a feel for getting the results you want.
I'd never use ChatGPT for anything that's even remotely obscure, controversial, or niche.
But through all my double-checking, I've had a phenomenal success rate in getting useful, readable, valid responses on well-covered, well-documented topics such as introductory French, introductory music theory, and well-covered, non-controversial history and science.
I'd love to see the example you experienced; if I ask ChatGPT "tell me about Toronto, Canada", my expectation would be high accuracy. If I asked it "Was Hum, Croatia, part of the Istrian liberation movement in the seventies", I'd have far less confidence: it's a leading question, on a less-covered topic, introducing inaccuracies in the prompt.
My point is: for a 3-hour drive to the cottage, I'm OK with something that's only 95% accurate on easy topics! I'd get no better from my spouse or best friend if they made it on the same drive :). My life will not depend on it, I'll have an educationally good time, and the miles will pass faster :).
(also, these conversations always seem to end in suffocatingly self-righteous "I don't know how others can live in this post-fact free world of ignorance", but that has a LOT of assumptions and, ironically, non-factual bias in it as well)
I don't think it's quite the same.
With search results, aka web sites, you can compare between them and get a "majority opinion" if you have doubts - it doesn't guarantee correctness but it does improve the odds.
Some sites are also more reputable and reliable than others - e.g. if the information is from Reuters, a university's courseware, official government agencies, ... etc. it's probably correct.
With LLMs you get one answer and that's it. Some, like Bard, provide alternate drafts, but they are all from the same source and can all be hallucinations...
Yes and no. If the LLM is repeating the same thing across multiple drafts, then it's very unlikely to be a hallucination.
It's when multiple generations are all saying different things that you need to take notice.
LLMs hallucinate, yes, but getting the same hallucination multiple times is incredibly rare.
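For what it's worth, that check is easy to automate. A minimal sketch using the OpenAI Python client, sampling the same question several times and seeing whether the answers agree; the model name and majority threshold are my own placeholder choices, and real use would need fuzzy/semantic comparison rather than the exact string match shown here:

    from collections import Counter
    from openai import OpenAI

    client = OpenAI()

    def consistent_answer(question, samples=5):
        # Ask for several independent completions of the same question.
        response = client.chat.completions.create(
            model="gpt-4o",   # placeholder model choice
            messages=[{"role": "user", "content": question}],
            n=samples,
            temperature=1.0,  # keep sampling diverse on purpose
        )
        answers = [c.message.content.strip() for c in response.choices]
        best, count = Counter(answers).most_common(1)[0]
        # If most drafts agree, the answer is less likely to be a one-off
        # hallucination; disagreement is the signal to get suspicious.
        return best if count > samples // 2 else None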
I've seen the hallucination rate of LLMs improve significantly; if you stick to well-covered topics, they probably do quite well. The issue is that they often have no tells when making things up.
A person who uses ChatGPT must have the understanding that it's not like Google search. The layman, however, has no idea that ChatGPT can give coherent incorrect information and treats the information as true.
Most people won't use it just for infotainment, and OpenAI will try its best to downplay hallucinations as fine print if it goes fully mainstream like Google search.
Rather than asking it about facts, I find it useful to derive new insights.
For example: "Tell me 5 topics about databases that might make it to the front page of hacker news." It can generate an interesting list. That is much more like the example they provided in the article, synthesizing a bed time story is not factual.
Also, "write me some python code to do x" where x is based on libraries that were well documented before 2022 also has similarly creative results in my experience.
Like talking to most people you mean?
You will now be able to feed it images and responses from the customers. Give it a function to call, complementaryDrink(customerId). Combine it with a simple vending-machine-style robot, or something more complex that can mix drinks.
I'm not actually in a hurry to try to replace bartenders. Just saying these types of things immediately become more feasible.
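For the curious, the function-calling half of this is only a few lines with the OpenAI Python client. A rough sketch, keeping the complementaryDrink(customerId) signature from above; the model name, the prompts, and the dispatch step are assumptions for illustration, not a real bartender system:

    from openai import OpenAI

    client = OpenAI()

    # Describe the tool so the model can decide when to call it.
    tools = [{
        "type": "function",
        "function": {
            "name": "complementaryDrink",
            "description": "Offer a free drink to a customer.",
            "parameters": {
                "type": "object",
                "properties": {"customerId": {"type": "string"}},
                "required": ["customerId"],
            },
        },
    }]

    response = client.chat.completions.create(
        model="gpt-4o",  # assumed tool-capable model
        messages=[
            {"role": "system", "content": "You are a bartender. Offer a free drink to regulars who seem unhappy."},
            {"role": "user", "content": "Customer 42 just sighed and pushed their empty glass away."},
        ],
        tools=tools,
    )

    # If the model chose to call the function, hand it off to the drink robot.
    for call in response.choices[0].message.tool_calls or []:
        if call.function.name == "complementaryDrink":
            print("dispensing drink:", call.function.arguments)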
You can also see the possibilities of the speech input and output for "virtual girlfriends". I assume someone at OpenAI must have been tempted to train a model on Scarlett Johansson's voice.
If people are treating LLMs like random strangers and only making small talk, fair enough, but more often they're treating them like an infallible fount of knowledge, and that's concerning.
That's on them. I mean, people need to figure out that LLMs aren't random strangers, they're unfiltered inner voices of random strangers, spouting the first reaction they have to what you say to them.
Anyway, there is a middle ground. I like to ask GPT-4 questions within my area of expertise, because I'm able to instantly and instinctively - read: effortlessly - judge how much to trust any given reply. It's very useful this way, because rating an answer in your own field takes much less work than coming up with it on your own.
All human interactions from all of history called and they …
I verify just about everything that I ask it, so it isn’t just a general sense of improvement.
Ah yes, I don't understand how to talk to people either!
Comments like yours make me think that no one cares about this...and judging by a lot of the other comments, I guess they don't.
It's probably going to be people wading through a sea of AI-generated shit, with the individual expected to just forever "apply critical thinking" to it all. Even a call from one's spouse could be fake, and you'll just have to apply critical thinking or whatever to work out whether you were scammed or not.
Then it makes stuff up far less frequently.
If the next version has the same step up in performance, I will no longer consider inaccuracy an issue - even the best books have mistakes in them, they just need to be infrequent enough.
> Then it makes stuff up far less frequently.
Now there's a business model for a ChatGPT-like service.
$1/month: Almost always wrong
$10/month: 50/50 chance of being right or wrong
$100/month: right 95% of the time