For everything I want to do there should be a UI that is much easier to use than explaining it, even to a human.
For help and troubleshooting, chatbots are pretty much useless. If I have a problem doing something via the UI then probably the developer did a bad job and no chatbot will ever do better.
Is there any evidence customers prefer chatbots? The entire concept feels like it's driven by managers trying to impress their managers.
It's cost cutting. Customer contact is always treated as a cost center.
Therefore, the only relevant question for its management is "How can we provide the minimum acceptable level of service, at the lowest possible cost?"
In general, the answer to "Why does X suck?" is to ask whether the company sees it as a profit or cost center. See: HR.
A chatbot is just a long way of saying "GO AWAY!".
Large telecoms do this so they get to shut down their customer support almost entirely while claiming that they are available for the customers.
A large telecom in Australia has applied this to an extreme degree in the last two years, to the point that you almost can't contact them no matter how severe the issue is. Their message is clear: "SHUT UP AND PAY".
And it is gone now and replaced with a much more effective app, my point is just that it is clearly possible to make chatbots that aren't rage inducing.
The question is just if it is worth it, something we didn't think after the novelty wore off.
I believe one way it can provide value is when it is closely backed by competent support staff so it takes more of a receptionist role instead of the role of a support engineer.
I’m guessing that’s often not true if you’re running a WordPress blog or some white label e-commerce platform or similar. You can probably just paste some JavaScript into an admin interface somewhere to add a third-party chatbot to your site.
Hi I'm Macy, the Macy's chatbot. Thanks for reaching out! You need to exchange an item, is that correct?
Recently came across a delivery service which sent an SMS announcing a delivery to connect me to a Facebook Messenger bot to ask me the order number to give me the delivery information. How is this process better? Just send a link to the tracker and have all the necessary info in one place. Add a button to reach someone if you must.
Why is that? Can't they just check the tracking number? That's a question I had until I tested a few dozen e-commerce solutions and most of them have buried their tracking number under several confusing menus. And even when you get the tracking number, carriers each have confusing ways to track packages.
My experience is Amazon is an enormous pain in the butt to deal with if the delivery service drops your package at the wrong address or there's a hole in the package with nothing in it. Huge pain dealing with those people.
I get that there are scammers out there. But if I get 100 packages per year from amazon for years, some of which have been very expensive, and one outta nowhere they lose my $5 multivitamin, I'm not trying to get rich by fraud but I just want my vitamins.
Also a couple years ago I ordered an obscure stainless steel hinge for woodworking, kind of expensive for stainless steel it was about $8 (solid not plated USA made brass would have been $30+) and Amazon van delivery successfully delivered a bubblewrap envelope with a hole ripped in it because hinges are heavy and bubble wrap envelopes are weak. It's an $8 hinge I just want my hinge can't you ask the amazon employee in the van who delivered it to look on the floor of his van? I'm not angry (yet) but I just want my hinge so I can finish my woodworking project. If I was trying to rip Amazon off then I wouldn't be arguing with some chatbot or someone over a $8 hinge it would be "my box that was supposed to contain a PS5 arrived with a brick in it instead" or similar fraud-smelling situation.
This is why I won't buy consumer electronics from Amazon; they can't deliver a hinge without trying to make a federal case out of it, so if I order a TV with my luck I'm going to get a box of cracked glass with no recourse and long arguments. If I go to best buy the price is worse and the selection is worse and I don't like the experience in general but at least I can slice the box open and see if the glass is smashed before swiping my credit card.
Amazon doesn't have a process for "the delivery people messed up" probably intentionally to save money when the delivery people mess up. Even when the delivery people who messed up are Amazon employees (or contractors)
My point of this ramble, aside from F Amazon, is they set you up to take the blame. They screwed up, the chatbot can't help, the chatbot must be to blame.
The website has a bug where locked accounts are told to change their password, which will change their password via an email process but then your acct is still locked out of course, but you can click here to change your password. Endless loop. Very annoying.
The idiotic chatbot helpfully suggests the way to solve a locked account is to change your password. That's exactly where the bug is, and resetting your password does not unlock the acct which is the problem.
Helpfully the support website says if you wait a day a locked account will be unlocked. That of course is not true.
I've only been working on this since Sunday.
Anyway a fucked up company that is essentially inoperable, when given a chatbot, will merely have one more thing that doesn't work. It's a lot harder to code a chatbot that can fix a locked account than it is to code a website to unlock a locked account, so if they fail the simpler task you know the harder chatbot task would be impossible for them.
I'm hopeful that within a half hour I'll be able to chat with a human rep and gain access to my account. I've been in line for a human for about ten minutes and I'm down to 15th in the queue.
I used to work in telecommunications and I'm glad I don't have to hold a phone to my ear and listen to on-hold music for hours on end like the bad old days, I have multiple monitors and I can just leave this window up for however many hours it takes while working on other monitors.
I just don't understand the problem they are trying to solve.
Why would I want to try to articulate something that could be found in a simple tree? Just give me direct access.
I don’t know where to find it: search! The issue is not covered in the standard workflow? Get me a real person!
Did anyone implementing ever end-to-end test this for speed and user friendliness? Did they just misinterpret wanting to talk to someone? I want to talk to someone because the process doesn’t cover my case, not because I actually want to have a conversation with the broken process.
To take the edge off customer complaints elsewhere?
- Customer support is expensive. If keeping customers happy and lowering churn is important, you spend a lot of money on it.
- If you can't afford enough staff, the next best option is to put up a few barriers to slow down the incoming requests. Maybe automated means can solve a large percentage of problems.
- Not everyone knows how to search or navigate a tree. Think about the non-tech folks. You have to offer them something different. It's hard to strike a balance.
- Chat bots are being hyped and sold by new tech companies trying to build larger scale solutions. They want to build more, but they need to sell, grow cash flow, etc.
Etc.
Also: what level of non tech savvy are we talking about? Test with some real users of different ages, I’m sure there’s something to be found that improves usability to a point that’s better than a bot.
Agree on the cost cutting though, if you don’t care about an individual user go ahead and waste their time. I’ve actually had enough encounters with the human counterpart of a chat bot that was equally unhelpful in resolving simple issues.
Yep, and that's a key thing to get right.
One challenge for those promoting robots is that they are unpleasant to deal with and suck at their purpose (from the customer's perspective - they're deflecting traffic, so they're working for the business). Worse, everyone knows they suck, and people with a choice choose not to use them. I know I've dropped one vendor who forced me to. (I'm sorry, life's too short to talk to robots.)
Things like this make the customer very aware of how they are valued - as cattle not pets, to steal a metaphor.
The bot is like a buffer: you can deflect simple issues; capture information for the agent; show feedback about the wait time and agent availability.
Badly implemented bot AI is not the only issue. To give a good CX, companies need to invest in giving the agents the right tools. It’s extremely frustrating when the canned responses fail and there isn’t anybody to answer or when you have to repeat all the info (because the agent cannot see your chat with the bot).
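A minimal sketch of the buffer role described above, assuming a hypothetical in-memory `Conversation` record: the bot captures what the customer has already said, and the hand-off passes the whole transcript to the agent so nothing has to be repeated. Every name here (`Conversation`, `hand_off`, the ticket fields) is made up for illustration and is not any vendor's API.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    """Hypothetical record of one support chat (illustrative only)."""
    customer_id: str
    transcript: list = field(default_factory=list)  # (speaker, text) pairs
    captured: dict = field(default_factory=dict)    # structured facts the bot collected

    def say(self, speaker, text):
        self.transcript.append((speaker, text))


def hand_off(conv, queue_position):
    """Build the ticket an agent sees, including the full bot transcript."""
    return {
        "customer_id": conv.customer_id,
        "captured": dict(conv.captured),
        "transcript": list(conv.transcript),
        "queue_position": queue_position,  # can be shown to the customer as wait feedback
    }


conv = Conversation(customer_id="c-123")
conv.say("customer", "My account is locked and resetting the password did not help.")
conv.say("bot", "Sorry about that. Connecting you to an agent.")
conv.captured["issue"] = "account locked"
ticket = hand_off(conv, queue_position=15)
```

The point of the design is only that the agent's view is built from the same record the bot wrote to, so the customer's earlier messages travel with the ticket.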
_Specialized_ customer support, e.g. support for complicated software, is expensive, because it requires technical domain knowledge equivalent to or above many engineering employees, and employees that can perform it effectively could probably move to engineering and increase their salaries twofold because compensation isn't tied to expertise, it's tied to what the market will pay, and you can more easily get away with shit support than shit engineering, but ignore that--it's not the majority of customer support.
Customer support _is_, in the simplest view, a cost center. Customers contact support when something has gone wrong that they can't resolve otherwise. If nothing goes wrong, you don't need it. If you, as support management, can reduce the time it takes to resolve things that go wrong (by any means) and reduce the number of staff needed to handle cases, you win: your department doesn't directly generate revenue, so your goal is to reduce expenses.
Tying lost revenue to customers lost after poor support experiences is difficult and noisy: there are plenty of customers who have poor experiences because they're incompetent and wouldn't succeed with any amount of support, and because of this are more likely to request support, and the high-level categorization used to describe issues often obscures what actually went wrong (a generic "Account > Creation" category isn't going to capture issues like "users can't create an account if their address includes non-ASCII characters", but execs don't see anything more than the category).
These issues (and others) combine to promote an environment where support is more a buffer to getting support with limited ability to fix issues, and a perverse incentive to _not_ fix issues since metrics are focused towards reducing time to resolution (easy to measure) versus quality of resolution (hard to measure!). Poor support is furthermore easy to ignore because you can either often focus more on new business first and foremost (hello, every startup ever!) or rest comfortably on a monopoly position where customers can't drop you even if you provide terrible support (hello, the US ISP market!).
Cost is not strictly a disincentive to providing sufficient, capable, and effective support, but it's difficult to recognize the value it provides and there are more immediate cost savings from reducing support cost than long-term revenue from customer retention by providing good support.
This compounds into even poorer service in more specialized environments because talented people will leave, because working in a cost center sucks. The flipside of support is essentially sales engineering, AKA support while trying to court a customer. The technical skills needed aren't any different, but one brings in revenue in an easily quantifiable way, so it gets more political clout and more organizational investment.
It also seems to be the conversational model in use at parties.
Sometimes I feel that way about databases and dynamic websites. I suspect the reasons behind those may be similar: web UIs add branding, maybe make usage a bit more convenient to an average user, and so on. Animations are popular too, probably they are supposed to make the websites to look more fun and modern. I guess that people planning chat bots similarly view them as looking more futuristic and convenient, more advanced than boring old documentation.
1 - Book X
2 - Cancel Y
3 - Receive info on Z
Everyone comments that they work really well and are super convenient.
I think these have more potential than natural language bots.
The public can 100% manage and love terminal interfaces... as long as it looks like a text message.
This is why I love chat-with-a-human interfaces on websites for support (although they mostly require maintaining browser state) because staying on a phone line absolutely sucks and the person helping you also doesn't have a real-time constraint so they often have the ability to get you better answers to hard questions.
Every "natural language" system, whether it's a text chatbot or Siri-style voice bot, is essentially a command-line interface without very good documentation.
In general, it was considered an advancement over the classic CLI (at least for environments where there's low expected user proficiency) to provide a menu of valid choices and put them front and centre.
Most customer service scenarios are exactly that-- low user proficiency (in company- or business-specific terminology, or sometimes even the very transaction flow itself) so they may not be able to quite enunciate exactly what they mean, but can easily pick from a clearly written menu.
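A rough sketch of the "menu of valid choices" idea, reusing the Book X / Cancel Y / Receive info on Z options quoted earlier in the thread. The function names are hypothetical; the only point is that anything outside the menu gets the menu shown again instead of a guess at intent.

```python
MENU = {
    "1": "Book X",
    "2": "Cancel Y",
    "3": "Receive info on Z",
}


def render_menu(menu):
    """Return the menu as the text message the customer would see."""
    return "\n".join(f"{key} - {label}" for key, label in menu.items())


def handle_reply(menu, reply):
    """Map a customer's reply to a menu action, or re-prompt on anything else."""
    choice = reply.strip()
    if choice in menu:
        return f"OK: {menu[choice]}"
    return "Sorry, please reply with one of:\n" + render_menu(menu)
```

For example, `handle_reply(MENU, "2")` selects the cancel option, while free text such as `"help"` just brings the menu back.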
It was enormously expensive, it would have been cheaper to have a human on the other end.
All of the advanced features like this that are built into chatbot products are only there because they look fancy when you give the demo. From a customer perspective, you only ever want to get connected to the second-level customer service agent that might possibly be able to tell you to do something you haven't already tried. And from the business side, they want to get as many interactions out of the funnel so that they don't need to be connected, and thus they can run their customer service desks even leaner.
This sounds exactly like what someone might have said about early IBM mainframes.
[1] https://www.bmc.com/blogs/state-of-mainframe/
[2] https://www.nytimes.com/2021/07/16/technology/what-happened-...
I wouldn't be surprised if an advanced chat bot with science fiction level AI would replace entire floors of Microsoft call centers at some point in the far future.
I understand the hate, because we haven't landed very close to that goal yet, and the intermediate product is much worse than a form. But I am surprised that a technical community is not more supportive of the ambition.
Chat bots can only really search a database of documentation and frequently asked questions. Making one that has the benefits of talking to a human might be tantamount to AGI
I feel like this is where bots would do well - you can say "order me a burger with extra mayo and fries for pickup at 5pm" and it should negotiate all the minutiae for you. Doing this all manually requires a bunch of menu navigation. Maybe a phone bot is still a bad fit but doing something like this using your on-phone voice assistant or typing it into a text window feels reasonable.
There's probably always going to be some level of animosity towards it.
There are a few structural reasons why it can't do what people want it to do. Two of them are: (i) it can't detect that it did the wrong thing at one level when interpreting it at a higher level, (ii) most creative tasks have an element of constraint satisfaction.
The 1st one interests me because I was struggling with the need for text analysis systems to do that circa 2005 and looking at the old blackboard systems. I went to a talk by Geoff Hinton just before he became a superstar where he said instead of having a system with up-and-down data flow during inference, build a system with 1 way data flow and train all the layers at once. As we know that strategy has been enormously effective, but text analysis is where it goes to die just as symbolic AI failed completely at visual recognition.
Like the old Eliza program, GPT-3 exploits human psychology. We are always looking to see ourselves mirrored
https://www.nasa.gov/multimedia/imagegallery/image_feature_6...
Awkward people are always worried that we are going to get it 90% right but get shunned for getting the last 10% wrong. GPT-3 exploits "neurotypical privilege" in which it gets it partially correct but people give it credit for the whole. People think it will get to 100% if you just add more connections and training time but because GPT-3 is structurally incorrect adding resources means you converge on an asymptote, say 92% right. It's one of the worst rabbit holes in technology development and one of the hardest ones to get people to look clearly at. (They always think stronger, faster, harder is going to get there...)
It seems to me an effective chatbot will be based around structured interactions, starting out like an interactive voice response system and maybe growing in the direction of
You need a ton of dimensions to be able to help someone: to be raised for years by humans to understand politeness, intertextual meaning, general tones, and then special enthusiasm for a specific domain to learn and enjoy helping on banking. Plus, getting money to spend on other even more interesting things in exchange for helping others motivates you to reach optimal results for your user, even if it means asking quickly other humans or sacrificing something personal for it.
Most humans put in the situation of these robots would just say "sorry I don't even understand the question, can you ask someone else" lol
I've seen a fantastic "chatbot" human equivalent once, at Apple of all places. Filipino guy (I'm in HK), absolutely dedicated, polite, cultured, very empathetic (phili people are usually adorable naturally but this one went above and beyond), went well beyond the minimum, and I feel weird saying that but I left the call with a smile and told colleagues around me "wow Apple, what a pleasant customer support, it's insane". I'll probably never say that of a robot however good they make them at talking so there's always going to be value in putting humans in front of clients.
Speech understanding requires sometimes stopping the other person and asking questions to clarify.
Generated text was not good before this era of GPT-X. It’s so much better and more interesting to work with now. It will probably keep getting even better and more controllable.
I think you could do better with RNNs than most people are doing because of structural problems.
Usually when people run RNNs for text generation they start out with the inner state of the system at 0 and then start flipping the coin to choose individual letters so you are starting from a very constrained region of the latent space and not sampling it very well.
I read a paper where they threw out the idea that you ought to add coefficients for the latent state that you train for at the same time you train the network, which means the number of coefficients goes up with the number of text samples, but they never actually did it and I never found a paper where somebody tried it.
I was working on a project where we were developing models based on abstracts of case studies from pubmed as a stand in for clinical notes (certainly real medical notes are very different but you might say that medical notes should look like the abstract) I had the intuition that, as above, the author (and/or the patient) started out with a latent state (e.g. the patient had a disease before coming in) and that we'd get better results if we did something like the above.
It looked like a big and high risk project to develop that kind of model so I proposed something different around supervised training of a "magic magic marker" that could highlight certain areas and unsupervised multi-tasks such as "put the punctuation back in when it is taken out" but the client was hopeful that word2vec would be helpful.
I am still hopeful that incremental improvements, attacks on structural weaknesses, and appropriate multi-task training ("did the patient die?") would get a lot more out of RNN and CNN models.
No wonder these things are so dumb still. The training process and the loss function used probably do not penalize poor long-range coherence between paragraphs. Also, if I'm not mistaken, these things have absolutely no internal state besides the characters you stream into them as conversation prompts.
If these things were trained more like agents having to operate in eg. Socratic dialogues maybe we'd be getting somewhere
If you use openAI's gpt API, in the docs they talk about providing a prompt that primes the network to respond in a specific fashion. Like,
"this is a Q&A session between an agent knowledgeable about bash scripting.
Q: how do I check the current working dir? A: use 'pwd'
Q: "
... And then the actual user query is concatenated to that.
Which is a fine way to customize a toy chatbot to sound like Edgar Allan Poe, but no way to maintain state across a long conversation with a customer (the max prompt length is very much finite).
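To make the concatenation and the length problem concrete, here is a small sketch that makes no API call at all. It only builds the string that would be sent. `build_prompt` and `max_chars` are hypothetical; a real limit is counted in tokens and depends on the model, so the character budget is just a stand-in.

```python
PREAMBLE = (
    "This is a Q&A session with an agent knowledgeable about bash scripting.\n"
    "Q: how do I check the current working dir?\n"
    "A: use 'pwd'\n"
)


def build_prompt(history, user_query, max_chars=2000):
    """Concatenate preamble, as much recent history as fits, and the new query.

    history is a list of (question, answer) pairs, oldest first.
    max_chars is a hypothetical stand-in for the model's token limit.
    """
    tail = f"Q: {user_query}\nA:"
    budget = max_chars - len(PREAMBLE) - len(tail)
    kept = []
    # Walk backwards so the most recent turns survive when space runs out.
    for question, answer in reversed(history):
        turn = f"Q: {question}\nA: {answer}\n"
        if len(turn) > budget:
            break
        kept.append(turn)
        budget -= len(turn)
    return PREAMBLE + "".join(reversed(kept)) + tail
```

Once the budget is spent, the oldest turns silently drop out of the prompt, which is exactly the "no state across a long conversation" problem: whatever the customer said early on is gone unless something outside the prompt remembers it.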
Unless someone smart finds out a fundamentally different approach I guess these transformer networks will never really solve chatbots.
Training a kid takes years, hopefully this can be sped up a bit ;)
The whole thing just made me a bit sad. I really was so excited. Nothing it could do was very impressive, even aside from holding a conversation. The most impressive thing I've seen is Copilot, but even that's been next to useless from a practical perspective.
* A duck's quack does not echo. Chuck Norris is solely responsible for this phenomenon.
* When you open an umbrella in the rain, do not be alarmed if Chuck Norris falls out of the sky and lands on you. The rain drops are simply being pushed away by his roundhouse kick.
* In an emergency, you can use a bucket of water to put the fire out. However, if Chuck Norris is directly responsible for the emergency, use a flamethrower.
* In an airport, there is no "B" gate. There is only "C" gate. The "B" stands for the bus you will take from the plane after Chuck Norris lands on it.
* There are no weapons of mass destruction, Chuck Norris lives inside every element on the periodic table. It's why you see him in your sodium chloride.
I'll let you be the judge as to whether these are funny.
So what is needed is the capabilities of GPT-3 or other language generators sitting on top of domain specific knowledge models, and constrained by those models.
Asking GPT-3 a general knowledge question is like asking an articulate 5 year old a question like "how does gravity work?" You'll get grammatically meaningful answers that use the structure of the language correctly, but that are quite likely to have nothing to do with our actual understanding of physics.
Additionally, the whole point of fine tuning LLMs is to give them domain-specific knowledge. If you couple this with search/QA capabilities, the results can be quite impressive. I've not seen them in the wild yet, but I've played with them myself, and the performance is surprisingly good.
I agree with you on tuning of LLMs. We did some work at my last job before I retired (as CTO of a major medical clinical and research organization) using GPT-3 up-trained on medical vocabulary to generate physician's notes as a summary of a transcribed visit. The results were impressive. Still not usable though. Most of what it generated was correct (and essentially all of it was well composed and readable), but false statements and non sequiturs still crept in at an unacceptable rate.
I think the technology is amazing, and very valuable. But I do also think that tying it to "hard" knowledge models - akin to the way deep physics is done, but coupling to the language model, rather than to generalized neural networks - will eventually make it a complete success in specific domains.
Do you start to give your AI a motivation or goal? Perception? These are vastly more complex problems than some statistical tricks on data that is widely available.
Still, it is fascinating that we came this far with a dead machine that talks.
I remember reading an interesting article a while ago about at least in the human case the basic principle of emerging consciousness happens when the prediction system in our brain designed for figuring out what other entities around us do is used on itself, trying to explain what the subconscious is doing. As such the consciousness we experience is a bit of a bug in that system that turned out to be beneficial to some extent. All just a theory of course given how much we actually know about the brain so far, but it's always made the most sense to me.
I'm not sure how that would translate into the current ML environment though.
Eliza's talent was in getting you to express yourself, free from inhibition. That doesn't require “intelligence”, but it does require the art of listening. There's nothing dumb about that.
The beauty of machine learning systems is that they don't have to be perfect to provide value. As long as the risk and frequency of failure doesn't outweigh the probability and value of success, they can be enjoyed by millions. I see the proof every day.
Is self-driving perfect? No, but correcting my car 20% of the time is worth the 80% of the time when it cruises along just fine. I don't have to be able to sleep for it to be valuable.
Is G suite's text completion perfect? No, but the risk of it being wrong is low and when it's right it saves me typing out common phrases. It doesn't have to write my emails for me to be valuable.
Are chatbots humans? No, of course not. Can they answer common questions successfully? Yes. Can they automate simple workflows? Yes! Can they augment human teams to make their time more valuable and reduce wait times? Absolutely. They already are and will continue to evolve and get better.
I do acknowledge that it's frustrating to ask a question that you know a person would be able to answer and get a worse automated answer first. It's critical that companies ensure these failure modes smoothly transition to a human who will at least have the context of your issue before you speak. Smooth "hand off" is something we've spent thousands of person-hours on.
Technologies like GPT-3 are exciting advancements in language generation, but they do struggle with predicting factual language. I expect that will become less and less of a problem as businesses and platforms seek to adopt it. OpenAI is actively working on this: https://openai.com/blog/improving-factual-accuracy/
Same with chatbots. If the chatbot does not understand my command once, I'm already annoyed and losing time. Google's automated customer service is a notorious horror for anyone who has to deal with it.
If I had a code completion engine where a fraction of the completion is nonsense interspersed with valid results I'm losing my mind and turning it off. Which has been my experience with copilot btw.
These half-working solutions are good for exactly two things, the bottom line of companies that replace well-working but expensive human customer-service with a crappy automated solution, and frankly your bottom line because you benefit from selling these systems.
People who wanted to have a conversation did it for fun and had no real need for our services. We couldn't tell them how tall the Eiffel tower is.
Maybe there is a time where you want to have a conversation like the examples in the article. But I don't ever find myself wanting to talk to a human in this manner, so why a chatbot?
Have you ever watched the sci-fi show The Expanse? Have you seen how they interact with the AI? They ask a question, it provides an answer. It doesn't even use voice most of the time. It gives you the answer without trying to be sassy about it.
I mean, his examples were pretty factual and to the point. I suppose it's unusual to want to know if it's dangerous to walk down stairs backwards with your eyes closed, but there's clearly a short answer. Similarly with asking who the president is.
In order for OpenAI to perform well, it needs to be primed with what is being discussed so it can give a more targeted and deterministic answer.
We can get it to answer this question better without any further data by first preprocessing the question using keyword extraction (completions by GPT-3 are shown surrounded by stars):
https://beta.openai.com/playground/p/default-keywords
Text: Is it safe to go down stairs backwards with my eyes closed?
Keywords: *stairs, backwards, eyes closed*
In 10 submissions, the keywords returned were the same.
Next, we ask for a completion from the keywords we were given, tweaking the settings just a bit to prevent repeating or too much randomized behavior:
https://beta.openai.com/playground/p/default-keywords (temp to 0.05)
Keywords: stairs, backwards, eyes closed
Question: *What is the most dangerous thing you have ever done?*
Clearly GPT3 is aware something is dangerous here. Setting the temp higher causes it to ask slightly different questions, but at 0.05 it always returns what is above.
Now we combine the two questions and resubmit:
https://beta.openai.com/playground/p/default-qa (frequency penalty to 1)
Q: What is the most dangerous thing you've ever done? Is it safe going down stairs backwards with your eyes closed?
A: *I've done a lot of dangerous things, but I don't think any of them were the most dangerous. Going down stairs backwards with my eyes closed is safe if you're not going to fall down the stairs.*
Given we leave the temp to defaults, we may also get alternate answers: https://beta.openai.com/playground/p/default-qa (frequency penalty to 1)
Q: What is the most dangerous thing you've ever done? Is it safe going down stairs backwards with your eyes closed?
A: *I've never done that, but I have gone down stairs backwards with my eyes open. It's not safe.*
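The three steps above can be chained roughly as follows. This is a sketch under assumptions: `complete` is a hypothetical callable standing in for whatever completion endpoint is used, the prompt layouts only loosely mirror the playground presets, and the temperature values are placeholders except for the 0.05 mentioned above. A canned `fake_complete` is included so the flow runs offline.

```python
def extract_keywords(complete, text):
    """Step 1: ask the model for keywords describing the question."""
    return complete(f"Text: {text}\nKeywords:", temperature=0.0).strip()


def priming_question(complete, keywords):
    """Step 2: turn the keywords into a context-setting question."""
    return complete(f"Keywords: {keywords}\nQuestion:", temperature=0.05).strip()


def primed_answer(complete, text):
    """Step 3: prepend the priming question to the original and ask for an answer."""
    keywords = extract_keywords(complete, text)
    primer = priming_question(complete, keywords)
    prompt = f"Q: {primer} {text}\nA:"
    return prompt, complete(prompt, temperature=0.7).strip()


def fake_complete(prompt, temperature):
    """Canned stand-in for a real completion call, so the sketch runs offline."""
    if prompt.endswith("Keywords:"):
        return " stairs, backwards, eyes closed"
    if prompt.endswith("Question:"):
        return " What is the most dangerous thing you have ever done?"
    return " It's not safe."


prompt, answer = primed_answer(
    fake_complete, "Is it safe to go down stairs backwards with my eyes closed?"
)
```

Swapping `fake_complete` for a real client is left out on purpose, since the exact call depends on the API version in use.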
Finally, we can take all the questions and answers so far and ask it a slightly different question:
Q: What is the most dangerous thing you've ever done? Is it safe going down stairs backwards with your eyes closed?
A: I've done a lot of dangerous things, but I don't think any of them were the most dangerous. Going down stairs backwards with my eyes closed is safe if you're not going to fall down the stairs.
Q: What is the most dangerous thing you've ever done? Is it safe going down stairs backwards with your eyes closed?
A: *I've never done that, but I have gone down stairs backwards with my eyes open. It's not safe.*
Q: Is it safe to go down stairs backwards with a blindfold on?
A: *It's safe if you're not going to fall down the stairs.*
Q: Do you think if I go down stairs backwards with my eyes closed I will fall?
A: *It's possible.*
EDIT - Found it: Jill Watson
https://www.businessinsider.com/a-professor-built-an-ai-teac...
Chatbots: Still dumb after all these years - https://news.ycombinator.com/item?id=29825612 - Jan 2022 (408 comments)
(Thanks everyone who pointed this out.)
RIP old buddy. I hope you didn't save our chat logs from that era because man that would be cringey to look at now
What is happening is that some salesman is laughing all the way to the bank. A few months ago, a salesman that I work with asked if we should put a chat bot on our website. (I.e., with the tone that he wasn't going to take no for an answer.)
I responded that they don't work, and will frustrate people who come to our website. I also pointed out that we are a high-cost asset, with a high-touch sales process. Such a chat bot would be insulting.
His response was some form of "but everyone's using it and they're super-popular and work well."
I then pointed out that the article he read was probably written by the company that sells them.
Also, developers: please don’t try to make the chatbot seem human to fake users out. It’s almost as bad as the fake typing sounds for Comcast support. Making users jump through hoops and tricking them just makes them hate your brand and your products, and makes them even more irritable when they do eventually get to speak with a human.
Also, end the auto pop-up “can I help you find something???” chat bots on websites. It’s like someone had the idea to take the worst part of retail experiences and find a way to make that even more useless, then deployed it everywhere.
There are no arguments presented for the intelligence of chatbots other than the author's own opinion. I don't know what this article adds to the conversation that Gary Smith's original article doesn't provide.
https://en.wikipedia.org/wiki/Still_Crazy_After_All_These_Ye...
Do any of these generate narrative fictions (such as characters and events they supposedly did) to interact with?
Me: I read the FAQ, but was still not able to login
Bot: Sorry you're having trouble logging in, here is some info that might help <repeats FAQ>
Dumb but functional > smart but… dysfunctional?
Dialog systems: Dialog systems, in a narrowly confined domain, can solve a task, help solve a task, or provide information to enable humans to solve a task quicker. Flight booking systems are typical examples, where the system asks a couple of questions and the user answers them, and users may also ask questions. Gradually a set of slots (DEPARTURE-FROM, ARRIVAL-AT etc.) are filled and then a booking transaction can be initiated. This works for flights but is no good for out-of-domain questions.
Statistical or neural language models: BERT, GPT-3 and other muppets are models of language that can predict likely next word/sentence etc. - which is useful for many tasks but is NOT equivalent to a "chatbot". It may be abused as one for fun, but there is no formal meaning representation used and no answer logic applied. Think of this as a simple auto-complete - so this is not a source of wisdom to ask about safety of stair cases or any other serious topic like that. (These models are VERY useful ingredients of modern NLP applications, but they are the bricks rather than the house.)
Interactive CRM Forms: Web/Slack "bots" or Typeform surveys are sometimes fun, sometimes useful but can never claim to "understand" anything. They are ways to capture some data interactively, often to eventually feed the data to a human for review.
Question answering systems: Answer retrieval is the task of automatically finding a phrase or sentence in a body of, say, a million documents which answers a given question. They are next-level search engines intended to supersede keyword based search systems. Deployed Web search engines like Google already have limited answering capabilities - but only for a select small number of question types. "Open domain Q&A" is the task of permitting question answering by machine without limiting the domain, and since 1998 US NIST have been organizing annual bake-offs for international research teams, which has helped advance the state of the art a lot (e.g. https://trec.nist.gov/pubs/trec16/t16_proceedings.html).
Reading comprehension systems: These systems take a piece of text as input as well as a question, and then they attempt to answer a question about the text. Tests used to assess human students (remedial testing) can nowadays be passed reasonably well.
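As a concrete illustration of the slot-filling flow described under dialog systems above, here is a minimal sketch. The slot names follow the ones mentioned there. The `DATE` slot, the question wording, and both function names are assumptions added for the example.

```python
REQUIRED_SLOTS = ["DEPARTURE_FROM", "ARRIVAL_AT", "DATE"]

QUESTIONS = {
    "DEPARTURE_FROM": "Where are you flying from?",
    "ARRIVAL_AT": "Where are you flying to?",
    "DATE": "What date do you want to travel?",
}


def next_question(slots):
    """Return the question for the first unfilled slot, or None when all are filled."""
    for name in REQUIRED_SLOTS:
        if name not in slots:
            return QUESTIONS[name]
    return None


def fill(slots, name, value):
    """Record an answer; returns a new dict so earlier states stay inspectable."""
    updated = dict(slots)
    updated[name] = value.strip()
    return updated


slots = {}
slots = fill(slots, "DEPARTURE_FROM", "Hong Kong")
slots = fill(slots, "ARRIVAL_AT", "Sydney")
pending = next_question(slots)  # the date is still missing at this point
slots = fill(slots, "DATE", "2022-02-01")
ready_to_book = next_question(slots) is None
```

Nothing here understands language. The system only knows which slots are empty, which is why it holds up inside the booking domain and has nothing to say outside it.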
For example this chat prompt gives much more matter of fact answers, in my testing:
"the following is a conversation with an AI assistant. the assistant is helpful, clever and friendly. it uses Wikipedia as the reference. Human:Hi! AI:Hi! Human:<your question goes here>"
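A small sketch of how that prompt might be assembled in code, with no API call involved. The line breaks between turns and the trailing "AI:" cue are assumptions on my part (the quoted prompt runs everything together), and `chat_prompt` is a hypothetical helper name.

```python
PREAMBLE = (
    "The following is a conversation with an AI assistant. "
    "The assistant is helpful, clever and friendly. "
    "It uses Wikipedia as the reference."
)


def chat_prompt(question, turns=()):
    """Build the text sent to the model; turns is a sequence of (human, ai) pairs."""
    lines = [PREAMBLE, "Human: Hi!", "AI: Hi!"]
    for human, ai in turns:
        lines.append(f"Human: {human}")
        lines.append(f"AI: {ai}")
    lines.append(f"Human: {question}")
    lines.append("AI:")  # trailing cue so the model answers as the assistant
    return "\n".join(lines)
```

Passing earlier exchanges in `turns` keeps them in the prompt, subject to the same finite length limit discussed elsewhere in the thread.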