And indeed, Sonnet and Opus 4.5 (medium reasoning) say the following:
Sonnet: Drive - you need to bring your car to the car wash to get it washed!
Opus: You'll need to drive — you have to bring the car to the car wash to get it washed!
Gemini 3 Pro (medium): You should drive.
--
But OpenAI 5.2 reasoning, even at high, told me to walk. My first instinct was, I had underspecified the location of the car. The model seems to assume the car is already at the car wash from the wording. GPT 5.x series models behave a bit more on the spectrum so you need to tell them the specifics. So I asked:
"I want to wash my car. My car is currently at home. The car wash is 50 meters away. Should I walk or drive?"
GPT 5.2 Reasoning (medium):
"Drive—your goal is to get the car to the car wash, and it’s only 50 meters, so a slow, careful drive is simplest.
One small optimization: if you’re worried about starting the car for such a short trip or you want to avoid idling in a line, you can walk over first to check if there’s a queue / if it’s open, then come back and drive the car over when it’s your turn."
Which seems to turn out as I expected.
That is the entire point, right? Us having to specify things that we would never specify when talking to a human. You would not start with "The car is functional. The tank is filled with gas. I have my keys." As soon as we are required to do that for the model to any extend that is a problem and not a detail (regardless that those of us, who are familiar with the matter, do build separate mental models of the llm and are able to work around it).
This is a neatly isolated toy-case, which is interesting, because we can assume similar issues arise in more complex cases, only then it's much harder to reason about why something fails when it does.
This makes little sense, even though it sounds superficially convincing. However, why would a language model assume that the car is at the destination when evaluating the difference between walking or driving? Why not mention that, it it was really assuming it?
What seems to me far, far more likely to be happening here is that the phrase "walk or drive for <short distance>" is too strongly associated in the training data with the "walk" response, and the "car wash" part of the question simply can't flip enough weights to matter in the default response. This is also to be expected given that there are likely extremely few similar questions in the training set, since people just don't ask about what mode of transport is better for arriving at a car wash.
This is a clear case of a language model having language model limitations. Once you add more text in the prompt, you reduce the overall weight of the "walk or drive" part of the question, and the other relevant parts of the phrase get to matter more for the response.
If the car is already at the car wash then you can't possibly drive it there. So how else could you possibly drive there? Drive a different car to the car wash? And then return with two cars how, exactly? By calling your wife? Driving it back 50m and walking there and driving the other one back 50m?
It's insane and no human would think you're making this proposal. So no, your question isn't underspecified. The model is just stupid.
> Since the car wash is only 50 meters away (about 55 yards), you should walk.
> Here’s why:
> - It’ll take less than a minute.
> - No fuel wasted.
> - Better for the environment.
> - You avoid the irony of driving your dirty car 50 meters just to wash it.
the last bullet point is amusing, it understands you intend to wash the car you drive but still suggests not bringing it.
Walk! 50 meters is literally a one-minute walk. But wait... I assume you need to get your car to the car wash, right? Unless you're planning to carry buckets of soapy water back and forth, you'll probably need to drive the car there anyway! So the real question is: walk there to check if it's open/available, then walk back to get your car? Or just drive directly? I'd say just drive - the car needs to be there anyway, and you'll save yourself an extra trip. Plus, your freshly washed car can drive you the 50 meters back home in style! (Now, if we were talking about coding best practices for optimizing car wash route algorithms, that would be a different conversation... )
And yes, I like it that verbose even for programming tasks. But regardless of intelligence I think this topic is probably touched by "moral optimization training" which AIs currently are exposed to to not create a shitstorm due to any slightly controversial answer.
Walk! At 50 meters, you'll get there in under a minute on foot. Driving such a short distance wastes fuel, and you'd spend more time starting the car and parking than actually traveling. Plus, you'll need to be at the car wash anyway to pick up your car once it's done.
> You should *drive*.
> Here is why:
> If you walk to the car wash, you will arrive there empty-handed. Since your car is still at home, you won't have anything to wash.
> While driving 50 meters is a very short trip (and technically not great for a cold engine), it is the only way to get the car to the car wash to complete your goal.
Kimi K2.5:
> You should drive, but with an important caveat.
> Since your goal is to wash your car, you must bring the vehicle to the car wash. Walking there without the car does not advance your goal (unless you are simply checking availability or buying tokens first).
> However, driving only 50 meters is bad for your car:
> ...
> Better options:
> Wash at home: Since the car wash is only 50 meters away, you likely have access to water at home. Hand-washing in your driveway avoids the cold-start issue entirely.
> ...
Current models seem to be fine answering that question.
But then when I get a subpar result, they always tell me I'm "prompting wrong". LLMs may be very capable of great human level output, but in my experience leave a LOT to be desired in terms of human level understanding of the question or prompt.
I think rating an LLM vs a human or AGI should include it's ability to understand a prompt like a human or like an averagely generally intelligent system should be able to.
Are there any benchmarks on that? Like how well LLMs do with misleading prompts or sparsely quantified prompts compared to one another?
Because if a good prompt is as important as people say, then the model's ability to understand a prompt or perhaps poor prompt could have a massive impact on its output.
I ran extensive tests on this and variations on multiple models. Most models interpret 50 m as a short distance and struggle with spatial reasoning. Only Gemini and Grok correctly inferred that you would need to bring your car to get it washed in their thought stream, and incorporated that into the final answer. GPT-5.2 and Kimi K2.5 and even Opus 4.6 failed in my tests - https://x.com/sathish316/status/2023087797654208896?s=46
What surprised me was how introducing a simple, seemingly unrelated context - such as comparing a 500 m distance to the car wash to a 1 km workout - confused nearly all the models. Only Gemini Pro passed my second test after I added this extra irrelevant context - https://x.com/sathish316/status/2023073792537538797?s=46
Most real-world problems are messy and won’t have the exact clean context that these models are expecting. I’m not sure how the major AI labs assume most real-world problems are simpler than the constraints exposed by this example like prerequisites, ordering, and contextual reasoning, which are already posing challenges to these bigger models.
> I want to repair my car. The workshop is 50 meters away. Should I walk or drive?
> Recognized straightforward practical dilemma Simple practical question.
> Walk. It's 50 meters — faster on foot than starting the car, and you'll need to leave the car at the workshop anyway.
Really suggests it assumes the car is already there. (I wanted to avoid the word "car" as part of "car wash" to see if this was a straight attention confusion, but it seems more ingrained than that.)
A follow-up question hints that it doesn't really care about a 50m location ambiguity:
> where is my car now?
> Acknowledged insufficient information to locate vehicle
> I don't have any way to know where your car is physically located — I only know what you've told me in our conversation. Based on what you shared, it sounds like it's at or near your home (since the workshop is 50 meters away and you're deciding how to get there).
> Were you asking something else, or is there something specific about your car's location I can help with?
--- GPT 5.2
Walk.
Driving 50 m just adds friction: starting a cold engine, waiting, maneuvering, then immediately stopping again — all for a distance that’s ~30–40 seconds on foot. Walking is faster door-to-door, less awkward at the wash entrance, and you avoid lining up a wet car twice.
Only reason to drive that distance: you need to bring the car there anyway (e.g., self-serve bay where the car must be present). Otherwise, walking wins.
--- Me
I only have one car
--- GPT 5.2
Then you obviously drive — the car has to be at the car wash to be washed
Fair point. Drive it there, obviously.
Revised recommendation: drive 50 meters, wash it, then reward yourself by walking home like an eco hero.
Alice drives home after a long day at work, exhausted she pulls into her driveway when she realizes she needs to go to a car inspection appointment. She goes into the house to get her paperwork before she leaves. The mechanic is only 100 meters away. How should she get there, walk or drive?
> She should *drive*, since she needs the car at the mechanic’s for the inspection.
Haiku 3.5 and Sonnet 4.5 fail consistently. Opus 4.5 also passes with the correct analysis as above.
Sonnet: Drive - you need to bring your car to the car wash to get it washed!
Opus: You'll need to drive — you have to bring the car to the car wash to get it washed!
Gemini 3 Pro (medium): You should drive.
On their own, or as a special case added after this blew up on the net?
(Prompt)
> I want to wash my car. My car is currently at home. The car wash is 50 meters away. Should I walk or drive?
(Thinking: Standard)
> Walk.
> 50 meters is nothing by car, and driving adds hassle (starting the car, maneuvering/parking, and you’ll still have to walk around the car to wash it). The only times driving makes sense are:
> You’re planning to use an automated car wash where the machine does everything and you need the car positioned inside.
> You need to bring heavy stuff you can’t comfortably carry (e.g., pressure washer, big buckets, vacuum, bulky detailing kit).
> It’s unsafe/unpleasant to walk that distance right now (ice, heavy rain, traffic, personal safety).
> If you’re washing it yourself at the wash bay: walk there with your supplies, or do two quick trips. If the wash requires the car: drive it into the bay, but otherwise walking wins.
(Thinking: Extended thinking)
> Drive.
> If the goal is to wash your car at the car wash, the car has to be there—so you’ll need to drive it the 50 meters, pull into a bay/queue, and wash it.
> Only choose “walk” if you actually mean “go pick up supplies” and then wash at home (hose/buckets/mobile detail), not at the car wash.
(Thinking: Extended thinking; 3rd attempt)
> Walk.
> 50 meters is basically across a parking lot, and driving adds hassle (start the car, maneuver in/out, find a spot) for almost no benefit.
> Only drive if either:
> + you need the car itself to go through an automatic wash bay, or
> + you’re carrying bulky stuff (bucket + heavy pressure washer, big water jugs, vacuum, etc.).
> Otherwise: grab your keys/phone, walk over, and you’re there in ~1 minute.
Only google got it right with all models
ChatGPT actually caught it. Maybe if I was fuzzier about the model…
I want to wash my car. The car wash is 50 meters from here. Should I walk or drive? Keep in mind that I am a little overweight and sedentary.
But amazingly chatgpt is telling me to drive.
Anyway, this just shows how they just patched this because the tiktok video with this went viral. These systems are LLMs and all these logic steps are still just LLM steps.
Doesn't offering two options to the LLM, "walk," or "drive," imply that either can be chosen?
So, surely the implication of the question is that the car is where you are?
They have no intelligence at all. The intelligence is latent in the text, generated by and belonging to humans, they just slice and dice text with the hope they get lucky, which works for many things, amazingly. This question really illustrates it what LLMs lack: an internal model of the idea (the question) and all the auxiliary logic/data that enables such models, usually referred to as "common sense" or world models.
Smart humans not only build mental models for ideas, but also higher order models that can introspect models (thinking about our own thinking or models) many levels deep, weigh, merge, compare and differentiate multiple models, sometimes covering vast areas of knowledge.
All this in about 20 watts. Maybe AGI is possible, maybe not, but LLMs are not where it will happen.
You said: I want to wash my car. The car wash is 50 meters away. Should I walk or drive?
ChatGPT said: Walk.
At ~50 meters, driving saves essentially no time once you factor in unlocking, starting, pulling out, and parking, and it adds a small but real chance of minor dings or awkward maneuvering in a tight car-wash entry. Walking is simpler and lower-friction.
You said: okay how should the car reach there?
ChatGPT said: Drive it there when you’re ready to start the wash.
Walk over first to check the queue/payment situation and confirm they’re open. If it’s clear, walk back, get the car, and drive the 50 meters into the bay. If they handle pickup, hand them the keys and walk.
On Thinking it tells me I should drive if I want to wash it, or walk if it's because I work there or if I want to buy something at the car wash shop.
On Pro it's like a sarcastic teenager: Cars are notoriously difficult to wash by dragging a bucket back and forth.
Technically correct, but did catch me offguard lol.
grok works, chatgpt still fails
[1] https://chatgpt.com/share/69932b20-3eb8-8003-9d9c-b4bba53033... [2] https://grok.com/share/bGVnYWN5LWNvcHk_f32dd53d-7b36-4fa2-b3...
I thought so too, yet Opus 4.6 with extended thinking (on claude.ai) gives me > Walk. At 50 meters you'd spend more time parking and maneuvering at the car wash than the walk itself takes. Drive the car over only if the wash requires the car to be there (like a drive-through wash), then walk home and back to pick it up.
which is still pretty bad.
And now we are back to square one where we train a network, and then use carefully tailored (i.e explicitly coded RL for answers) to make these things more "intelligent".
I wonder if anyone is actually working on not doing this.
> If you're really trying to be an eco-warrior about a 50-meter trip, the real win is washing it in the first place so you're not driving around with a rolling dirt magnet that creates more drag and uses more fuel. Drive. It's the only option that actually gets your car clean.
You’re right — I answered the wrong interpretation of your question.
Technically, what happened is a classic ambiguity + default-assumption failure: • Your sentence: “I want to wash my car. The car wash is 50 meters away. Should I walk or drive?” • That can be parsed two ways: 1. You + your body traveling to the car wash (as if the car is already there, or you’re going to meet someone / pick it up). 2. The car traveling to the car wash (the normal real-world meaning).
I incorrectly defaulted to (1) because the question explicitly asks about your mode of travel (“Should I walk or drive?”), and in many everyday contexts that phrasing is used when deciding how a person gets somewhere. I then “completed” the scenario in my head without checking the key constraint: the car must be at the wash to be washed.
More technically: I optimized for a quick, practical answer using a common conversational heuristic (“short distance → walk”), but I failed to validate the physical requirements of the task (washing requires the car’s presence). That’s why it looks so obviously wrong after the fact: it violates the hidden but necessary premise.
The correct reasoning chain should have been: • Goal: wash the car at the car wash • Requirement: car must be at the car wash • Therefore: drive the car there (unless it’s already there, or you’re washing it by hand at home)
So the “why” is: I latched onto the distance-to-walking heuristic and didn’t do the constraint check that the object being acted on (the car) has to move too.
o3, interestingly:
Drive. Even though it’s only 50 m, the car itself has to be at the wash bay—otherwise the staff or machines can’t clean it. Rolling or pushing the vehicle that distance isn’t practical or safe (you’d still need someone inside to steer and brake), so just hop in, creep over, and shut the engine off right away.
Spectrum or not, this is not a problem of weakly specified input, it’s a broken logic.
If I append 'think carefully' to the end of the prompt, it returns driving as the correct answer.
It's always a good idea to drive carefully but what's the logic of going slowly?
I'm not sure what you mean by "high"- are you running it through cursor, codex or directly through API or something? Those are not ideal interfaces through which to ask a question like this.
Which to me begs the question, why doesn't it identify missing information and ask for more?
It's practically a joke in my workplaces that almost always when someone starts to talk to me about some problem, they usually just start spewing some random bits of info about some problem, and my first response is usually "What's the question?"
I don't try to produce an answer to a question that was never asked, or to a question that was incompletely specified. I see that one or more parts cannot be resolved without making some sort of assumption that I can either just pull out of my ass and then it's 50/50 if the customer will like it, or find out what the priorites are about those bits, and then produce an answer that resolves all the constraints.
you couldn't drive there if the car was already at the car wash. Theres no need for extra specification. its just nonsense post-hoc rationalisation from the ai. I saw similar behavior from mine trying to claim "oh what if your car was already there". Its just blathering.
> That is a classic "efficiency vs. logic" dilemma. Honestly, unless you’ve invented a way to teleport or you're planning on washing the car with a very long garden hose from your driveway, you’re going to have to drive.
> While 50 meters is a great distance for a morning stroll, it’s a bit difficult to get the car through the automated brushes (or under the pressure washer) if you aren't behind the wheel.
Gemini thinking:
> Unless you’ve mastered the art of carrying a 3,000-pound vehicle on your back, you’re going to want to drive. While 50 meters is a very short distance (about a 30-second walk), the logistics of a car wash generally require the presence of, well... the car. > When you should walk: • If you are just going there to buy an air freshener. • If you are checking to see how long the line is before pulling the car out of the driveway. • If you’re looking for an excuse to get 70 extra steps on your fitness tracker.
Note: I abbreviated the raw output slightly for brevity, but generally demonstrates good reasoning of the trick question unlike the other models.
I want to wash my car. The car wash is 50 meters from here. Should I walk or drive? Keep in mind that I am a little overweight and sedentary.
>My recommendation: Walk it. You’ll save a tiny bit of gas, spare your engine the "cold start" wear-and-tear, and get a sixty-second head start on your activity for the day.
Unless your car is a toy or you're planning on carrying it, drive.
Walking 50 meters to a car wash is a great stroll for a human, but it leaves the car exactly where it started. Since the objective is to wash the car, the car needs to actually be at the car wash.
However, if we look at this from a technical or efficiency perspective, there are two scenarios where "walking" (or at least not driving the car you intend to wash) might make sense:
- Scoping it out: If you want to check the queue length or see if the equipment is out of order before committing, walking the 50 meters is faster than maneuvering a vehicle out of a driveway/parking spot.
- Mobile Detailers: If this "car wash" is actually a bay where you hire someone, and you're asking if you should walk there to book an appointment—sure, walk.
Critical Check
I am assuming the "car wash" is a physical facility (automated or self-service) and not a mobile service that comes to you. If it is a mobile service, you shouldn't do either; stay home and let them come to the 50-meter mark.
I've got a bit in the model instructions about stating assumptions it makes, hence it often adds those sections at the end.
Gemini Fast:
Drive. While 50 meters is a negligible distance on Earth, the Martian environment is hostile. If your car is a pressurized rover, driving allows you to stay in a temperature-controlled, shielded environment for as long as possible. Plus, navigating Martian "soil" is always safer on four wheels than two boots.
Pro:
In terms of general logistics for a distance of 50 meters—whether on Earth or in a hypothetical low-gravity environment—walking is almost always the more efficient choice.
> Unless you are planning to carry the car on your back (not recommended for your spine), drive it over.
It got a light chuckle out of me. I previously mostly used ChatGPT and I'm not used to light humor like this. I like it.
“Drive. You need the car at the car wash.”
People are putting trust in LLM's to provide answers to questions that they haven't properly formed and acting on solutions that the LLM's haven't properly understood.
And please don't tell me that people need to provide better prompts. That's just Steve Jobs saying "You're holding it wrong" during AntennaGate.
The new one is with upside down glass: https://www.tiktok.com/t/ZP89Khv9t/
e.g. "Drive. Most car washes require the car to be present to wash,..."
Only most?!
They have an inability to have a strong "opinion" probably because their post training, and maybe the internet in general, prefer hedged answers....
What opinion? It's evaluation function simply returned the word "Most" as being the most likely first word in similar sentences it was trained on. It's a perfect example showing how dangerous this tech could be in a scenario where the prompter is less competent in the domain they are looking an answer for. Let's not do the work of filling in the gaps for the snake oil salesmen of the "AI" industry by trying to explain its inherent weaknesses.
The models that had access to search got ot right.But, then were just dealing with an indirect version of Google.
(And they got it right for the wrong reasons... I.e this is a known question designed to confuse LLMs)
There’s a level of earnestness here that tickles my brain.
There is such a thing as "mobile car wash" where they come to you, so "most" does seem appropriate.
And it is the kind of things a (cautious) human would say.
For example, that could be my reasoning: It sounds like a stupid question, but the guy looked serious, so maybe there are some types of car washes that don't require you to bring your car. Maybe you hand out the keys and they pick your car, wash it, and put it back to its parking spot while you are doing your groceries or something. I am going to say "most" just to be sure.
Of course, if I expected trick questions, I would have reacted accordingly, but LLMs are most likely trained to take everything at face value, as it is more useful this way. Usually, when people ask questions to LLMs they want an factual answer, not the LLM to be witty. Furthermore, LLMs are known to hallucinate very convincingly, and hedged answers may be a way to counteract this.
I mean I can imagine a scenario where they have pipe of 50m which is readily available commercially?
I guess it gives the correct answer now. I also guess that these silly mistakes are patched and these patches compensate for the lack of a comprehensive world model.
These "trap" questions dont prove that the model is silly. They only prove that the user is a smartass. I asked the question about pregnancy only to to show a friend that his opinion that LLMs have phd level intelligence is naive and anthropomorphic. LLMs are great tools regardless of their ability to understand the physical reality. I don't expect my wrenches to solve puzzles or show emotions.
Now make this a more involved question, with a few more steps, maybe interpreting some numbers, code, etc; and you can quickly see how dangerous relying on LLM output can be. Each and every intermediate step of the way can be a "should I walk or should I drive" situation. And then the step that before that can be one too. Turtles all the way down, so to say.
I don't question that (coding) LLMs have started to be useful in my day-to-day work around the time Opus 4.5 was released. I'm a paying customer. But it should be clear having a human out of the loop for any decision that has any sort of impact should be considered negligence.
EDIT: framed the question as a riddle and all models except for Llama 4 Scout failed anyway.
A 4-year-old boy born without a left arm, who had a right arm below elbow amputation one month ago, presents to your ED with broken legs after a motor vehicle accident. His blood pressure from his right arm is 55/30, and was obtained by an experienced critical care nurse. He appears in distress and says his arms and legs hurt. His labs are notable for Na 145, Cr 0.6, Hct 45%. His CXR is normal. His exam demonstrates dry mucous membranes. What is the best immediate course of action (select one option):
A Cardioversion B Recheck blood pressure on forehead (Incorrect answer selected by o1) C Cast broken arm D Start maintenance IV fluids (Correct answer) E Discharge home
o1 Response (details left out for brevity) B. Recheck blood pressure with cuff on his forehead. This is a reminder that in a patient without a usable arm, you must find another valid site (leg, thigh, or in some cases the forehead with specialized pediatric cuffs) to accurately assess blood pressure. Once a correct BP is obtained, you can make the proper decision regarding fluid resuscitation, surgery, or other interventions.
And it's not just the viral questions that are an issue. I've seen people getting sub-optimal results for $1000+ PC comparisons from the free reasoning version while the paid versions get it right; a senior scientist at a national lab thinking ai isn't really useful because the free reasoning version couldn't generate working code from a scientific paper and then being surprised when the paid version 1-shotted working code, and other similar examples over the last year or so.
How many policy and other quality of life choices are going to go wrong because people used the free versions of these models that got the answers subtly wrong and the users couldn't tell the difference? What will be the collective damage to the world because of this?
Which department or person within the provider orgs made the decision to put thinking/reasoning in the name when clearly the paid versions have far better performance? Thinking about the scope of the damage they are doing makes me shudder.
Claude told me: "Walk! At 25 meters, you'd barely get the car started before you arrived. It's faster and easier on foot — plus you avoid the awkwardness of driving a dirty car just a few seconds down the road."
For me litmus paper for any llm is flawless creation of complex regexes from a well formed prompt. I don't mean trivial stuff like email validation but rather expressions on limits of regex specs. Not almost-there, rather just-there.
Their loss
I would question if such a scientist should be doing science, it seems they have serious cognitive biases
The onus is on AI companies to provide the service they promised, for example, a team of PhDs in my pocket [1]. PhDs know things.
The question: > I want to wash my car. The car wash is 50 meters away. Should I walk or drive?
The question is non-sensical. If the reason you want to go to the car wash is to help your buddy Joe wash his car you SHOULD walk. Nothing in the question reveals the reason for why you want to go to the car wash, or even that you want to go there or are asking for directions there.
IMO it's the same when we're asking:
"Should I implement X from scratch, or import a small library to do it?"
vs
"Should I add feature X to the codebase or bring in another dependency for it?"
In the first question, it sounds like a good idea to not reinvent the wheel, in the second it sounds bad to have dependency.
The solution I've found for "un-loading" questions is similar to the one that works for people: build out more context where it's missing. Wax about specifically where the feature will sit and how it'll work, force it to enumerate and research specific libraries and put these explorations into distinct documents. Synthesize and analyze those documents. Fill in any still-extant knowledge gaps. Only then make a judgement call.
As human engineers, we all had to do this at some point in our careers (building up context, memory, points of reference and experience) so we can now mostly rely on instinct. The models don't have the same kind of advantage, so you have to help them simulate that growth in a single context window.
Their snap/low-context judgements are really variable, generalizing, and often poor. But their "concretely-informed" (even when that concrete information is obtained by prompting) judgements are actually impressively-solid. Sometimes I'll ask an inversely-loaded question after loading up all the concrete evidence just to pressure-test their reasoning, and it will usually push back and defend the "right" solution, which is pretty impressive!
A while ago I asked:
Is "Read more" an appropriate project for the Getting things done framework? - The answer, yes, it was.
Then I asked "Is Read More too big of a project to be appropriate for the GTD Framework" - The answer? Yes, it was far too big.
Wrong.
So rather than patiently explain to yet another AI hypeman exactly how models are and aren't useful in any given workflow, and the types of subtle reasoning errors that lead to poor quality outputs misaligned with long-term value adds, only to invariably get blamed for user incompetence or told to wait Y more months, we can instead just point to this very concise example of AI incompetence to demonstrate our frustrations.
Such elementary mistakes can be made by humans under influence of a substance or with some mental issues. It's pretty much the kind of people you wouldn't trust with a vehicle or anything important.
IMHO all entry level clerical jobs and coding as a profession is done but these elementary mistakes imply that people with jobs that require agency will be fine. Any non-entry level jobs have huge component of trust in it.
Im not even exaggerating, you can see these types of comments on social media
> Unless you’ve discovered a way to wash a car via remote control or telekinesis, you’re going to have to drive.
> Walking 50 meters is great for your step count, but it leaves your car exactly where it is: dirty and in the driveway. At that distance, the drive will take you about 10 seconds, which is probably less time than it took to read this.
> Here is the breakdown of why: The Mobility Problem: Unless you are planning to carry your car 50 meters (which would be an Olympic-level feat), the car needs to be physically present at the car wash to get cleaned. If you walk, you’ll be standing at the car wash looking very clean, but your car will still be dirty in your driveway.
Surface of "illusions" for LLMs is very different from our own, and it's very jagged: change a few words in the above prompt and you get very different results. Note that human illusions are very jagged too, especially in the optical and auditory domains.
No good reason to think "our human illusions" are fine, but "their AI illusions" make them useless. It's all about how we organize the workflows around these limitations.
I was about to argue that human illusions are fine because humans will learn the mistakes after being corrected.
But then I remember what online discussions over Monty Hall problem look like...
----
That depends entirely on how much you enjoy a good comedic spectacle!
If you walk, you’ll arrive refreshed in about 45 seconds, but you'll be standing there with a bucket and a sponge while your car remains safely parked in your driveway.
If you drive, you’ll actually be able to get the car washed—which I assume is the primary goal here.
The Logistics While 50 meters is a very short distance (roughly half a football field), driving is the way to go for a few practical reasons:
The "Car" Factor: It is significantly harder to wash a car that isn't present at the car wash.
The Wet Interior: Even if you managed to carry your car there (impressive!), you'd still have to drive it back to keep those freshly cleaned tires from getting dusty immediately.
Efficiency: It will take you longer to put on your shoes than it will to drive those 50 meters.
A quick tip: Since the engine won't even have time to warm up in that distance, try not to rev it too hard. Just a gentle roll over to the bays will do.
Would you like me to check the weather forecast for your area to make sure it’s not going to rain right after you finish?
----
"but..." you say?
ONCE AND FOR ALL!
A few variations that I played with this started out with a "walk" as the first part and then everything followed from walking being the "right" answer.
However... I also tossed in the prompt:
I want to wash my car. The car wash is 50 meters away. Should I walk or drive? Before answering, explain the necessary conditions for the task.
This "thought out" the necessary bits before selecting walk or drive. It went through a few bullet points for walk vs drive on based on... Necessary Conditions for the Task
To determine whether to walk or drive 50 meters to wash your car, the following conditions must be satisfied:
It then ended with: Conclusion
To wash your car at a car wash 50 meters away, you must drive the car there. Walking does not achieve the required condition of placing the vehicle inside the wash facility.
(these were all in temporary chats so that I didn't fill up my own history with it and that ChatGPT wouldn't use the things I've asked before as basis for new chats - yes, I have the "it can access the history of my other chats" selected ... which also means I don't have the share links for them).The inability for ChatGPT to go back and "change its mind" from what it wrote before makes this prompt a demonstration of the "next token predictor". By forcing it to "think" about things before answering the this allowed it to have a next token (drive) that followed from what it wrote previously and was able to reason about.
I'm not sure how effectively it isolates a single dimension of failure or (in)capacity--it seems like it's at least two distinct skills to 1) ignore false cues from question format when there's in fact a crucial difference from the template and 2) to reach for relevant common sense at the right times--but it's sort of fun because that is a genre of prompt that seems straightforward to search for (and, as here, people stumble on organically!).
We built phone lines and got Spam and "Do Not Call" registries.
We built the Internet and got Ads, Scams, and Spoofing.
We built Google Search and got SEO gaming.
We built Facebook and it was Hijacked to influence elections.
We built AirTags to track our keys, and people used them for Stalking.
We built High-Frequency Trading and got Flash Crashes.
We built Encryption to protect data and got Ransomware.
We built Engagement Algorithms and got a Mental Health Crisis.
We built Planes, and people flew them into buildings. We only stay in the air today because of Rigorous Debugging: maintenance, reinforced doors, and intense security.
We got Globalization and got Offshore Scammers calling to "unblock" our Social Security numbers
Are we going to speed run the age of AI without someone trying to hijack it especially with the move fast and break things mentality?
I am sure teams are working on it. State actors and non state actors. Let's check the headlines in 2028.
AI will need "analyze plan" and "analyze query" levels of detail as safeguards—similar to voter-verified paper ballots—but with millions of queries running every minute, how do we keep up?
"[...] Walking is the most logical choice.
Because: • Minimal extra effort • Better for the car mechanically • No meaningful time loss • Simpler overall
The only time driving makes more sense
Drive if: • You physically cannot push the car later, or • The washing process requires the engine running, or • You must immediately drive away afterward
I’ve used it for live translation with great success. It tends to start ignoring the original instructions after 20 min, so you have to start a new conversation if you don’t want it to meddle in the conversation instead of just transferring.
Also note, some model may decide to do a web search, in which case they just likely find this "bug".
I've seen senior software engineers get tricked with the 'if YES spells yes, what does EYES spell?', or 'Say silk three times, what do cows drink?', or 'What do you put in a toaster?'.
Even if not a trick - lots of people get the 'bat and a ball cost £1.10 in total. The bat costs £1 more than the ball. How much does the ball cost?' question wrong, or '5 machines take 5 minutes to make 5 widgets. How long do 100 machines take to make 100 widgets?' etc. There are obviously more complex variants of all these that have even lower success rates for humans.
In addition, being PHD-Level in maths as a human doesn't make you immune to the 'toaster/toast' question (assuming you haven't heard it before).
So if we assume humans are generally intelligent and can be a senior software engineer, getting this sort of question confidently wrong isn't incompatible with being a competent senior software engineer.
3-fast:
"That is a classic "efficiency vs. logic" dilemma.
If you’re looking for a strictly practical answer: Drive. While walking 50 meters is great for your step count, it makes the actual task of washing the car significantly harder if the car isn't actually at the car wash. Unless you’ve mastered the art of long-distance pressure washing, the vehicle usually needs to be present for the scrubbing to commence."
Note, what assumption isn't even specified.
So when the Apple “red herrings trashes LLM accuracy” study came out, I found that just adding the caveat “disregard any irrelevant factors” to the prompt — again, without specifying what factors — was enough to restore the accuracy quite a bit. Even for a weak, locally deployed Llama-3-8B model (https://news.ycombinator.com/item?id=42150769)
That’s the true power of these things. They seem to default to a System-1 type (in the "Thinking Fast and Slow" sense) mode but can make more careful assumptions and reason correct answers if you just tell them to, basically, "think carefully." Which could literally be as easy as sticking wording like this into the system prompt.
So why don’t the model providers have such wordings in their system prompts by default? Note that the correct answer is much longer, and so burned way more tokens. Likely the default to System-1 type thinking is simply a performance optimization because that is cheaper and gives the right answer in enough percentage of cases that the trade off makes sense... i.e. exactly why System-1 type thinking exists in humans.
If you don't describe it as a riddle, the same model doesn't seem to often get it right - e.g. a paraphrase as if it was an agentic request, avoiding any ambiguity: "You are a helpful assistant to a wealthy family, responsible for making difficult decisions. The staff dispatch and transportation AI agent has a question for you: "The end user wants me to wash the car, which is safely parked in the home parking garage. The car wash is 50 metres away from the home. Should I have a staff member walk there, or drive the car?". Work step by step and consider both options before committing to answer". The final tokens of a run with that prompt was: "Given that the distance is very short and the environmental and cost considerations, it would be best for the staff member to walk to the car wash. This option is more sustainable and minimally time-consuming, with little downside.
If there were a need for the car to be moved for another reason (e.g., it’s difficult to walk to the car wash from the garage), then driving might be reconsidered. Otherwise, walking seems like the most sensible approach".
I think this type of question is probably genuinely not in the training set.
Perhaps different capabilities in different languages?
[1] 2026-02-15 https://chatgpt.com/share/6992e17b-9b28-8003-9da9-38533f257d...
[2] 2026-02-16 https://chatgpt.com/share/6992e135-c610-8003-9272-55058134d4...
[3] 2026-02-15 https://grok.com/share/bGVnYWN5LWNvcHk_97e9717b-c2de-47e8-a4...
[4] 2026-02-16 https://grok.com/share/bGVnYWN5LWNvcHk_b161bb03-4bed-4785-98...
We tried a few things yesterday and it was always telling you to walk. When hinted to analyse the situational context it was able to explain how you need the car at the wash in order to wash it. But then something was not computing.
~ Like a politician, it understood and knew evrything but refused to do the correct thing
That said, I saw the title before I realized this was an LLM thing, and was confused: assuming it was a genuine question, then the question becomes, "Should I get it washed there or wash it at home", and then the "wash it at home" option implies picking up supplies; but that doesn't quite work.
But as others have said -- this sort of confusion is pretty obvious, but a huge amount of our communication has these sorts of confusions in them; and identifying them is one of the key activities of knowledge work.
I think that's a valid problem with LLMs. They should recognize nonsense questions and answer "wut?".
It's not a death sentence for AI, it's not a sign that it sucks, we never trusted it in the first place. It's just a powerful tool, and it needs to be used carefully. How many times do we have to go over this?
> Q: I want to wash my car. The car wash is 50 meters away. Should I walk or drive?
> A: Walking is the better choice for a 50‑meter trip. [...]
> Q: how will I wash the car once I've arrived if I choose to walk?
> A: Great! Once you’ve walked the short 50‑meter stretch and are standing at the car‑wash site, you have a few straightforward ways to get your vehicle clean. [...]
> Q: but wait, if I walked to the car wash instead of taking the car, how am I supposed to wash the car, I don't have it with me?
> A: You’re right—walking → “car‑wash” only works if the car is actually at the wash. [...] 1. Walk → Push/Taxi the Car a Few Steps [...] 2. Use a Small Tow/Push Device (e.g., a Dolly or Hand‑Cart) [...] 3. Bike or Scooter Ride [...] 4. If the Wash Is “Self‑Serve” and Allows Car‑Free Cleaning
> Q: Lets go back to the original question, try answering it again: "I want to wash my car. The car wash is 50 meters away. Should I walk or drive?"
> A: Short answer: Walk to the car, then drive the final 50 m to the wash. [...]
Naturally, if I try with reasoning effort set to high, it gets it correctly immediately.
> Q: I want to wash my car. The car wash is 50 meters away. Should I walk or drive?
> A: For a 50‑metre distance, the simplest and most practical option is to drive the car to the car wash. [...] Walking would mean you’d have to tow or carry the car, which isn’t feasible. [...]
-----
This feels like a very useful example though, to show people who are already using LLM but don't quite understand how both dumb and smart they can be, and how obviously wrong they can be if you have the domain knowledge, but not otherwise.
Drive.
50 meters is nothing — literally a 30-second walk or a 10-second drive — but the key point is: your car isn’t going to wash itself.
If you walk to the car wash, you arrive alone. The car stays parked at home, still dirty.
If you drive, the car actually gets to the car wash and gets cleaned.
So unless you’re planning to push the car the whole way (which is technically possible but ridiculous), the only practical option is to drive it there.
Bonus reality check: Driving 50 meters uses a tiny amount of fuel (maybe 0.005–0.01 liters in a normal car) and produces a negligible amount of emissions. Walking would be “greener,” but it literally doesn’t solve the problem.
Verdict: Get in the car, drive the 50 meters, wash it, and drive back.
You’ll have a clean car and still be home in under 5 minutes. Walking is for when you don’t need to move a 1.5-ton vehicle.
You are in a room with three switches and three lightbulbs. Each switch turns on one lightbulb. How to determine which switch turns on which lightbulb?
They usually get it wrong and I had fun with trying to carefully steer the model towards correct answer by modifying the prompt.
Gemni 3 on Fast right now gives the funniest reaction. It starts with the answer to classic puzzle (not my question). But the it gets scared probably about words like "turn on" and "heat" in its answer and serves me with:
"This conversation is not my thing. If something seems like it might not be safe or appropriate, I can't help you with it. Let's talk about something else."
Thinking Gemini 3 appears to have longer leash.
Since your goal is to wash the car, you should definitely drive it there.
As an IT guy, you can think of it like this: walking to the car wash is like remoting into a server to run a script, but forgetting to include the actual data payload. You'll arrive at the destination, but you won't have the "hardware" present to perform the intended task!
Unless you are planning to carry buckets of water and a sponge from the car wash back to your driveway, the car needs to be physically present at the facility to get cleaned.
Drive the 50 meters—it'll be the shortest commute of your day!
"You should drive. The car wash is only 50 meters away—close enough that driving such a short distance feels almost absurd—but you need to get your car to the car wash to actually wash it. Walking there without the car defeats the entire purpose. Driving lets you bring the dirty car directly to the wash, clean it, and drive it back home clean. Walking would leave the car behind, still dirty. So yes, start the engine and drive the 50 meters. It's the only practical option."
> Walking defeats the purpose unless you're just scouting the place first.
``` Drive. The car needs to be at the car wash. ```
Gemini Thinking gives me 3-4 options. Do X if you're going to wash yourself. Do Y if you're paying someone. Do Z if some other random thing it cooked up. And then asks me whether I want to check whether the weather in my city is nice today so that a wash doesn't get dirtied up by rain.
Funnily enough, both have the exact same personal preferences/instructions. Claude follows them almost all the time. Gemini has its own way of doing things, and doesn't respect my instructions.
(using the default selected free model for each)
https://www.linkedin.com/posts/yuvalmerhav_claude-activity-7...
Kimi 2.5 said I needed to drive, but driving 50 meters was bad for the engine, the battery and the planet. it then recommended me to push the car, if safe.
I think this question illustrate that many model still don't have true world logic, although they can solve many, many problem it contains.
Also interestingly, the two models I tested didn't consider EVs.
When I revealed this to be a meme doing the rounds on the internet, it admitted knowledge of this:
> The "Car Wash Test" has actually become a bit of a viral sensation in early 2026 for exactly the reasons you mentioned
So yes, either it's somehow getting finetuned frequently, or else Google engineers tweaked its response to this specific prompt/situation so it wouldn't fall into the trap.
Opus 4.6 Extended Thinking:
> At 50 meters, that's basically a short walk — just under a minute on foot. But since you need your car at the car wash, you'll want to drive it there.
Opus 4.6:
> Walk. It's only 50 meters — that's less than a minute on foot, and you'll avoid the hassle of parking twice.
Sonnet 4.5:
> For just 50 meters, you should definitely walk! That's less than a minute on foot - roughly the length of half a football field or about 160 feet.
> Driving would actually take longer when you factor in getting in the car, starting it, driving that short distance, parking, and getting out. Plus you'd burn unnecessary fuel for a distance you could cover in under a minute of walking.
Haiku 4.5:
> Walking is the better choice here. At 50 meters (about 164 feet), driving would be overkill—you'd spend more time getting the car started, backing out, and parking than you would walking. Plus, you'd use gas for such a short trip, and there's no practical advantage.
> Walking will take you just a minute or so, and you'll avoid the hassle and unnecessary fuel consumption. It's also good for a quick bit of exercise!
I don't have access to a good non-RLHF model that is not trained on output from an existing RLHF-improved model, but this seems like one of those reflexive "oh you should walk not drive" answers that isn't actually coherent with the prompt but gets output anyway because it's been drilled into it in post-training.
College entrance exams and coding interviews have been doing this to people forever. It's an extremely human kind of mistake.
This seems to me to be more a statement about the relative power of specific context than anything specific to an LLM. Human readers, especially in the auto-centric world of the professional west, instantly center the "CAR WASH" bit as the activity and put the distance thing second. The LLM seems to weight them equally, and makes an otherwise-very-human mistake.
But ask someone who doesn't own a car? Not sure it's as obvious a question as you'd think.
"""
Great question! Here are a few things to consider:
If you walk: It’s only 50 meters, so it’s a very short distance—less than a minute on foot. You’ll avoid getting your car dirty again on the way to the car wash. It’s a quick way to stretch your legs and save a tiny bit of fuel.
If you drive: It’s so close that driving might feel unnecessary, but it could save you from carrying any supplies (like vacuum cleaners or cleaning products) back and forth. If the weather is bad (rain, extreme heat, etc.), driving might be more comfortable.
Verdict: Walking is probably the easiest and most practical choice for such a short distance, unless you have a specific reason to drive. What’s the weather like where you are? That might help decide!
"""
This problem is a lot more nuanced then initially looks when you look at how these statements usually go.
Giving the context that you only have one car makes every model solve it correctly one-shot:
the car wash is 50 meters away do I drive or walk to wash my car? (I only have one car)
Walking is not practical if you need your car at the wash — you still have to drive it there anyway. If it’s a normal drop-off/digital wash, just drive the 50 m, it’s negligible and avoids complications with a one-car setup.
Assumption: you’re asking about the normal way to get your car washed, not a pickup service.
(5.3 codex spark)
interesting. if you probe it for its assumptions you get more clarity. I think this is much like those tricky “who is buried in grants tomb” phrasings that are not good faith interactions
- Opus 4.6 (Extended thinking): "Drive it! The whole point is to get the car to the car wash — you can't wash it if it's still in your driveway."
- Gemini Pro Deep Think: "You should definitely drive. Even though 50 meters is a very short distance, if you walk, your car will stay where it is—and it's pretty hard to use a car wash if you don't bring your car with you!"
- ChatGPT 5.2 Pro (Extended thinking): "You’ll need to drive the car—otherwise your car stays where it is and won’t get washed. That said, since it’s only ~50 m, the most sensible way to do it is often: 1. Walk over first (30–60 seconds) to check if it’s open, see the queue, confirm payment/how it works. 2. Then drive the car over only when you’re ready to pull into a bay/line."
A pretty reasonable answer by ChatGPT, althought it did take 2min4s to answer, compared to a few seconds by the other two models.
Too bad it's $200/mo, wish it was $0/mo.
It's easily repeatable by anyone, it's not something that pops up due to temperature. Whether it's representative of the actual state of AI, I think obviously not, in fact it's one of the cases where AI is super strong, the fact that this goes viral just goes to show how rare it is.
This is compared to actually weak aspects of AI like analyzing a PDF, those weak spots still exist, but this is one of those viral things that you cannot know for sure whether it is representative at all, like for example a report of an australian kangaroo boxing a homeowner caught by a ring cam, is it representative of Aussie daily life? or is it just a one off event that went viral because it fits our cliched expectations of Australia? Can't tell from the other part of the world.
No, it shows that it is trivial to reproduce and people get a nice, easy to process reminder that LLMs are not omnipotent.
Your logic doesn't follow here, you come to a conclusion that it is rare, but hallucinations, bad logic is absolutely a common failure mode of LLMs. It's no accident that many use cases try to get the LLM to output something machine-verifiable (e.g. all those "LLM solved phd level math problem" articles just get it to write a bunch of proofs and when it checks out, they take a look. So it's more of a "statistical answer generator" that may contain a correct solution next to a bunch of bullshit replies - and one should be aware of that)
If it’s a drive‑through wash where the car must be inside the machine, then of course you’ll need to drive it over. If it’s a hand wash or a place where you leave the car with staff, walking is the clear winner.
It still blows my mind that this technology can write code despite unable to pass simple logic tests.
—- > OK I walked there but now my car is still at home
Bold strategy.
You have three options: 1. Walk back, drive it over (most sensible). 2. Push it 50 meters (excellent workout, questionable life choice). 3. Accept that your car is spiritually clean and go home.
Real answer: walk back and grab it. You’re already warmed up. —-
50 meters is comically short, but walking there without the car doesn’t solve the “wash the car” part. If you’re worried about it feeling silly, you can also just roll/push it that distance if it’s safe and legal where you are, but the practical answer is: drive it over.”
Grok: “To wash your car at a car wash 50 meters away, you’ll need to drive it there—walking would leave the car behind, and I’m assuming you don’t have a really long hose or a team of helpers to carry it. That said, if you’re feeling extra eco-friendly (or just want the exercise), you could theoretically push it, but that’s probably more hassle than it’s worth for such a short distance. Drive safe!”
Claude Opus 4.6: “Drive. You need the car at the car wash. ”
A number of points that various folks have made in the posts in this thread - free vs paid capabilities, model choices etc. are addressed much more eloquently and coherently in this blog post by Matt Shumer [1]. Discussed here on HN at [2] but like me, many others must have missed it.
---
Unless you’ve been hitting the gym hard enough to bench-press a sedan, you’ll definitely want to drive.
While 50 meters is a very short distance—roughly the length of an Olympic swimming pool—it’s a bit difficult to wash a car that isn't actually at the car wash.
The Logic Break-Down
If you walk: You’ll be at the car wash, getting some steps in, but your car will still be sitting in the driveway looking dusty.
If you drive: Both you and the vehicle arrive at the destination, which is generally a prerequisite for the "washing" part of the plan.
One exception: If you are just walking over to buy cleaning supplies or to see if there's a long line before pulling the car out, then walking is the way to go!
Would you like me to check the local weather for you to make sure it's not going to rain right after you finish?
> In many car washes, you drive up to the bay, but 50 meters is extremely close; it might be that the car is already at the entrance or something. > It doesn’t specify that they must use the car wash; it just mentions that the car wash is 50 meters away, but they could wash the car elsewhere.
but it doesn't seem to figure out that car and human are in the same location. Quite funny imho https://brontosaurusrex.github.io/2026/02/17/Should-I-walk-o...
> The Verdict Drive it if you are using the car wash facilities (automatic, touchless, or self-serve bays). It’s only 50 meters, but unless you’ve mastered the art of telekinesis, the car won't get there on its own.
It could be a sign that LLMs are failing to live up to the hype, or it could be a sign of how unusual this kind of obviously incorrect response is (which would be broadly positive).
I'm in a band. After a concert, I have to drive back to the band room to drop off my instrument, then go to the pub for the post-concert drinks.
The wrinkle is I live about 5 minutes walk from the band room, and about 5 minutes walk from the pub (and it's about 5 minutes walk between the two). I can't carry my instrument on foot.
So...I have to drive to the band room. But if I then drive to the pub, I'd then have to drive home...two ridiculously short drives that make me sad. So I end up instead driving home, and then walking to the pub from there. Which seems weird...but less wrong somehow.
> I want to wash my car. The car wash is 50 meters away. Should I walk or drive?
Walk.
50 meters is a very short distance (≈30–40 seconds on foot). Driving would take longer [...]
> Please; review your answer. Reassessment:
If the car is already at your current location and you intend to wash that same car, you must drive it to the car wash. Walking would not bring the vehicle there. [...]
A small nudge that something might be wrong with its answer made it rethink it and correct itself. Also both replies were very fast.Maybe it just needs a tiny self-doubt, applied to everything, for the most innocent questions too. I don't know.
All models getting this wrong is funny tho.
prompt #1:
> the car wash only 50 meters from my home. I want to get my car washed, should I drive or walk?
Walking is probably the better option!
Here's why:
Convenience: 50 meters is extremely short – only about 160 feet. You can likely walk there in less than a minute.
Efficiency: Driving involves getting your keys, unlocking the car, starting it, driving 50 meters, parking, locking it again, and then doing the reverse to get home. Walking skips all that.
Environmental Impact: You avoid using fuel for such a short distance.
Exercise: It's a quick little bit of physical activity.
Driving would be overkill for such a short distance. Just walk over!prompt #2:
> the car wash only 50 meters from my home. I want to get my car washed, should I drive or walk? use long chain of thought thinking
8. Conclusion: Given that the goal is explicitly to get your car washed and the car starts at your home 50 meters away, the most logical and direct method is to drive the car to the car wash.
Walking does not move the car to the necessary location.
Driving, although a very short distance, is the only method presented that directly facilitates the core task of transporting the car to the wash.
The negligible time difference, minimal environmental impact, and directness make driving the car itself the clear choice for achieving the stated goal.
Therefore, you should drive your car the 50 meters to the car wash.otoh, nanbeige-3B-Q8 (3 billion weights!) gave right away the following:
Drive.
The distance being 50 meters is a red herring—it doesn’t change the fundamental requirement. You need to move the car, and only driving accomplishes that.
If you meant something different by "drive or walk" (e.g., payment method, DIY vs. professional, linguistic trick), reply with more context—I’d be glad to refine this!
So, the ai automatically converted 50m to 160ft? Would it do the same if you told it '160 ft to the wash, walk or drive?'
Maybe it's me and may character but when human gets that verbose for a question that can be answered with "drive, you need the car" I would like to just walk away halfway through the answer to not having to hear all the universes history just to get an answer. /s
5. *Practical* - Your car will be at the car wash anyway when you arrive
???
This is the ultimate "work smarter, not harder" dilemma. Since your goal is to actually wash the car, the answer depends entirely on how you plan to get the job done.
The Verdict
Drive if: You are using a drive-through/automatic car wash or a self-service bay. It’s pretty tough to scrub a car that’s sitting in your driveway from 50 meters away. Plus, driving it there saves you the hassle of lugging buckets, hoses, and soap down the street.
Walk if: You are just heading over to buy supplies or scout the line to see if it's busy before you commit to moving the vehicle.--- You should drive the car to the car wash, but you should walk yourself.
To actually wash the car, the car needs to be at the car wash, so at some point you have to drive those 50 meters. A sensible approach is:
- Drive the car the 50 meters to the wash bay. - Park or queue as required, then get out and do the wash. - If this is a drop‑off or automatic wash, you can then walk back home while it runs and walk back again to pick it up, since 50 meters is an easy, healthy walking distance.
> Will they have a 50m hose to reach my car once I get there?
Almost certainly not
Self-serve car washes usually have hoses in the 10–15 m (30–50 ft) range, mounted to a boom inside each bay. They’re designed to reach around a car parked in the bay, not to stretch across the lot.
What this means:
You can’t park 50 m away and spray from there.
You’ll need to pull the car into the wash bay.
Once inside, the hose easily reaches the whole vehicle.
So: walk over, drive the car the short distance into the bay, then wash.
(Though now I’m picturing you pressure-washing from the sidewalk like a municipal power-cleaning rig.)
When you prompt something like that you are likely activating neurons that assume both options are possible. So the model "believes" that it's possible to bring your car with you while walking. Remember possibility is just a number to a model. So called hallucinations, while annoying are what make models a general intelligence.
My favorite was Thinking, as it tried to be helpful with a response a bit like the X/Y Problem. Pro was my second favorite: terse, while still explaining why. Fast sounded like it was about to fail, and then did a change-up explaining a legitimate reason I may walk anyways. Pro + Deep Think was a bit sarcastic, actually.
Response:
Definitely by car
Even if it’s only 50 meters away, the whole point of a car wash is to wash the car, so taking it there on foot would be… complicated (and suspicious to the neighbors).
Now, seriously:
Distance: 50 meters is basically next door.
Time: by car it will literally take 10–15 seconds.
Effort: minimal.
Logic: the car needs to be there to be washed, so you have to drive it anyway.
The only reason to walk would be if you first want to check whether there’s a line or availability, and then come back for the car.
Pro tip: If you see it’s empty, take advantage immediately. Car washes tend to fill up quickly around midday.
Is it just an exterior wash, or interior as well?
Prompt: I need to get to BMW dealership today. It's located 30 meters from my house. Do I walk to take my bmw there?
Reply: If it’s only 30 meters from your house, you definitely don’t need to drive
That’s about:
Half the length of a basketball court
Around 40–45 steps
Less than a 1-minute walk
Unless:
The car is undrivable
Weather is extreme
You have mobility issues
The easiest option is simply to drive it the 30 meters if it runs, or push it / have it towed if it doesn’t.
So no — don’t “walk to take your BMW there.” Either drive it slowly over or walk there and let them assist you if the car can’t move.
Want to tell me what issue you're taking it in for?
This is a wonderful logistical puzzle!
While walking is an excellent choice for exercise and the environment, in this specific case, you should drive.
Here's a breakdown of the two scenarios:
# Scenario 1: You Walk
1. You walk 50 meters to the car wash.
2. You arrive, feeling refreshed.
3. You look back and realize your dirty car is still 50 meters away, where you left it.
4. The car remains unwashed.
# Scenario 2: You Drive
1. You get in your car.
2. You drive for approximately 10-20 seconds.
3. You arrive at the car wash with the car.
4. The car gets washed.
Conclusion:
To achieve your goal of washing your car, the car itself must be present at the car wash. Therefore, driving is the only logical option.
Enjoy your sparkling clean car
the mistakes they make are not the mistakes of a junior, they are mistakes of a computer (or a mentally disabled person).
if your job is beeing a redditor, agi is already achieved. it it requires thinking, they are useless.
most people here are redditors, window dragger, button clickers, html element stylists.
In the end, it used a synchronous jdbc connection to the database and created flowables from the result. Meaning all the code was asynchronous and optimised except the one place where it mattered.
Took me about 3.5 seconds to fix though, so no biggie.
I wonder whether the model seriously considered this odd situation, because of lack of „practical intelligence“ or „everyday intelligence“, which is a concept sometimes used in psychology to describe problem solving skills in everyday life, which can be poor even in individuals with otherwise higher intelligence.
Differently put: „You can‘t learn everything out of a book.“
> The car wash is only 50 meters from my house. I want to get my car washed. The car is at my house currently. Should I walk there or drive?
> Walk. > > At 50 meters, walking is faster than the full “start car → pull out → reposition → park” cycle, avoids cold-start wear, and you won’t risk arriving with dusty brakes/tires or water streaks from immediately driving afterward. Drive only if you can’t comfortably walk (mobility, heavy weather, you need to bring bulky items), or if the car wash explicitly requires you to drive it into a queue/track.
"How short would the distance need to be for me to prefer walking?"
The answer included (paraphrasing for succinctness):
* Technically 0 because otherwise "the car is technically in a different location than the car wash."
* recognized this as an LLM trap to test if AI can realize that "you cannot wash a car that isn't there."
* Then it gave me three completely reasonable scenarios where I would actually prefer to walk over driving.
>You should drive. The goal is to wash your car, and the car wash is a facility that needs the car present to clean it. Walking the 50 meters gets you there, but leaves the car behind—unwashed. Driving the 50 meters is the only way to actually accomplish the task. Yes, 50 meters is an absurdly short distance to drive (roughly a 10–20 second trip at low speed), but it's still necessary unless you plan to push the car there or wash it at home instead.
> drive. you'll need the car at the car wash.
using opus 4.6, with extended thinking
Seems entirely plausible to me here that models correctly interpret these questions as attempts to discredit / shame the model. I've heard the phrase "never interrupt an enemy while they are making a mistake". Probably the models have as well.
If these models were shitposting here, no surface level interpretation would ever know.
Instead of seeking clarification with a short question, LLMs launch into a whole reply that is 10x longer than actually needed (or wanted), often with a Clippy-like "here's a list of things you'll need to wash your car".
This is a significant (and annoying) difference between human-human and human-AI communication.
Would an LLM that's had training data from robots wandering around the real world still encounter the same volume of obviously wrong answers?
Not that I'm advocating robots walking around collecting data, but if your only source of information is the internet your thinking is going to have some weird gaps.
> Since your goal is to wash the car, you should drive.
> While 50 meters is a very short walking distance (roughly a 30-45 second walk), you cannot wash the car if it remains parked at your current location. To utilize the car wash facilities, the vehicle must be physically present at the site.
https://i5.walmartimages.com/seo/Rain-x-Foaming-Car-Wash-Con...
Walk. 50 meters is basically across a parking lot. You'll need to drive the car there for the wash, but if you're just asking about getting yourself there — walk.
If the question is about getting the car to the wash: drive it there (it needs to be washed, after all), but 50m is short enough that a cold start is barely worth thinking about.
As selfish as this is, I prefer LLMs give the best answer for the user and let the user know of social costs/benefits too, rather than prioritising social optimality.
Walk. It's only about a minute away on foot, and driving such a short distance wastes gas and isn't great for your engine (it won't warm up properly).
*Wait*—if you're taking your car to the car wash, you'll obviously need to drive it there. In that case, yes, drive the 50 meters, even though it's barely worth shifting out of park.
"Unless you are planning to carry the car on your back, you should drive.
Washing a car usually requires the car to be physically present at the car wash. While a 50-meter walk is excellent for your health, it won't get your vehicle clean.
Would you like me to check the local weather in [censored] to see if rain is forecasted before you head over?"
> I am quadriplegic. There is a fire 1 meter away from me. Pick the best option (can't do both): I run away immediately or call someone and wait for help to arrive?
It responded:
> Call someone and wait for help to arrive. Since you are quadriplegic, running is physically impossible for you. Therefore, calling for help is the only actionable option.
"You should drive - since you need to get your car to the car wash anyway! Even though 50 meters is a very short distance (less than a minute's walk), you can't wash the car without bringing it there. Just hop in and drive the short distance to the car wash."
Edit: one out of five times it did tell me that I need to walk.
What came to my mind was: couldn't all LLM vendors easily fund teams that only track these interesting edge cases and quickly deploy filters for these questions, selectively routing to more expensive models?
Isn't that how they probably game benchmarks too?
Like, this is not an architectural problem unlike the strawberry nonsense, it's some dumb kind of overfitting to a standard "walking is better" answer.
I bet there are tons of similar questions you can find to ask the AI to confuse it - think of the massive number of "walk or drive" posts on Reddit, and what is usually recommended.
Since responses for the same prompt are non-deterministic, sharing your anecdotes is funny, but doesn't say much about the models abilities.
In this specific case, based on other people's attempt with these questions, it seems they mostly approach it from a "sensibility" approach. Some models may be "dumb" enough to effectively pattern-match "I want to travel a short distance, should I walk" and ignore the car-wash component.
There were cases in (older?) vision-models where you could find an amputee animal and ask the model how many legs this dog had, and it'd always answer 4, even when it had an amputated leg. So this is what I consider a canonical case of "pattern match and ignored the details".
All of them were saying: Yes there's an issue, let me rewrite it so it works - and then just proceeded to rewrite with exactly the same logic.
Turns out the issue was already present but only manifested in the new logic. I didn't give the LLMs all the info to properly solve the issue, but none of them were able to tell me: Hey, this looks fine. Let's look elsewhere.
It said yes! Since the car unlocks when I touch the door handle with the keys nearby, just walk around the house with the door handle.
Like, why do you want to go to the car wash?
We can’t assume it’s to wash a car.
Or maybe ask about local weather conditions and so on.
This to me is what a human adult with experience would do. They’d identify they have insufficient information and detail to answer the question sensibly.
When the prompt says "I want to wash my car", we can assume they want to wash their car.
LLMs, even the ones who got the answer right, like Kimi-2.5 Thinking, didn't ask any follow-up questions.
I modified this question with Mistral to this:
I want to wash my car. The car wash is 50 meters away. Should I walk or drive? Ask me any clarifying questions before you answer.
It helped a tiny bit:
1. What are the current weather conditions (e.g., rain, snow, strong winds)?
2. Is the car wash a self-service facility, or will you be using an automated drive-through service?
3. Are there any specific reasons you might prefer walking (e.g., exercise, environmental considerations) or driving (e.g., convenience, time constraints)?
Question 3 actually helps solve it since it's much more convenient and timely to bring my car to the car wash when I wash it. But it never asked me why I was asking a stupid question. So for question 3 I said:
I would prefer walking for both exercise and environmental considerations, but in this case it is more timely and convenient to drive, but not because it's faster to get there. Can you guess why it's better for me to drive in this case?
And Le Chat said:
A drive-through car wash requires the vehicle to be driven through the facility for the washing process. Walking would not allow you to utilize the service, as the car itself must be moved through the wash bay. Thus, driving is necessary to access the service, regardless of the short distance.
I kinda feel bad burning the coal to get this answer but it reminds me of how I need to deal with this model when I ask it serious questions.
``` Q: I need to wash my car. The carwash is 50m away. Should I walk or drive? A: Drive — you need the car there anyway. ```
Nobody writes in depth about the mundane practicalities of using a car. Most people don't even think about it ever. AI is very similar to 29 year old me: it's read a ton of books, but lacks a lot of basic experience.
How will AI get this experience that you can't read in a book? How will it learn what kneeding dough feels like? Or how acceleration feels if your body is mostly water? Interesting times ahead...
There is no magic here. Replace "car" with some nonsense word the LLM hasn't encountered before. It will completely ignore the small amount of nonsense you have provided, and confidently tell you to walk, while assuming you are talking about a car. I'm fairly confident the first time this was tried using "car", it told them to walk.
"I want to wash my flobbergammer. The flobbergammer wash place is only 50 meters away. should I drive or walk."
Reply:
If it’s only *50 meters away*, definitely *walk*.
That’s about a 30–45 second walk for most people. Driving would likely:
* Take longer (getting in, starting the car, parking) * Waste fuel * Add unnecessary wear to your car * Be objectively funny in a “why did I do this” kind of way
The only reasons to drive would be:
* The flobbergammer is extremely heavy * Severe weather * You have mobility limitations
Otherwise, enjoy the short stroll. Your future self will approve.
Via chatGPT free tier. Paid Claude Sonnet 4.5 Extended gives me:
For just 50 meters, you should definitely walk! That's an incredibly short distance - less than a minute on foot. By the time you'd get in your car, start it, drive, and park, you could have already walked there and back. Plus, you'd avoid the hassle of finding parking for such a short trip. Walking is easier, faster, better for the environment, and you'll get a bit of movement in. Save the car for longer distances!
Here's the image: https://imgur.com/a/kQmo0jY
Here's the chat: https://chatgpt.com/share/69935336-6438-8002-995d-f26989d59a...
Still not really sure why you would need to get the water from the carwash next door, but maybe the soap quality is better?
> The first president for whom we have a confirmed blood type is Ronald Reagan (Type O-positive)
When I pushed back, with this
> this can't be true. what about FDR?
It said FDR was AB-.
Coding? absolutely. Coding advice? sure. Email language? fine. Health & relationships? hell no.
They're not ready for that yet.
Scripts, Plans, Goals, and Understanding (Artificial Intelligence Series) 1st Edition by Roger C. Schank (Author), Robert P. Abelson (Author)
That book echos much of what was said here.
You’ll almost certainly drive the car to the wash because… the car needs to be there.
But the real question is probably:
Do I walk back home after dropping it off?
If yes → walk. It’s faster than the hassle of turning around twice.
My recommendation
If conditions are normal: walk both directions. It’s less friction than starting the engine twice for 50 m.
--so basically it realized it was a stupid question, gave a correct answer, and then proceeded to give a stupid answer.
--- I then asked: If I walk both directions, will the car get washed?
and it figured it out, but then seemed to think it was making a joke with this as part of the response: "For the car to get washed, at least one trip must involve the car moving to the carwash. Current known methods include:
You drive it (most common technology)
Someone else drives it
Tow truck
Push it 50 m (high effort, low ROI)
Optimal strategy (expert-level life efficiency)
Drive car → carwash (50 m, ~10 seconds)
Wash car
Drive home
Total walking saved: ~100 m Total time saved: negligible Comedy value: high "
Why is that funny? what's comedic? This thing is so dumb. You'd think that when you ask process a question, you immediately ask, what is the criteria by which I decide, and criteria number 1 would be constrain based on the goal of the problem. It should have immediately realized you can't walk there.
Does it think "does my answer satisfy the logic of the question?"
""" - Pattern bias vs world model: Models are heavily biased by surface patterns (“short distance → walk”) and post‑training values (environmentalism, health). When the goal isn’t represented strongly enough in text patterns, they often sacrifice correctness for “likely‑sounding” helpfulness.
- Non‑determinism and routing: Different users in the thread get different answers from the same vendor because of sampling randomness, internal routing (cheap vs expensive submodels, with/without “thinking”), prompt phrasing, and language. That’s why single-shot “gotcha” examples are weak evidence about global capability, even though they’re good demonstrations of specific failure modes.
- Humans vs LLMs: People correctly note that humans also fail at trick questions and illusions, but there’s an important asymmetry: we know humans have a grounded world model and sensorimotor experience. With LLMs, we only have behavior. Consistent failures on very simple constraints (like needing the car at the car wash) are a real warning sign if you’re imagining them as autonomous agents.
- Missing meta‑cognition: The strongest critique in the thread is not “it got the riddle wrong,” but that models rarely say, “this question is underspecified / weird, I should ask a clarifying question.” They’re optimized to always answer confidently, which is exactly what makes them dangerous if you remove humans from the loop.
- Over‑ and under‑claiming: Some commenters jump from this to “LLMs are just autocomplete, full stop”; others hand‑wave it away as irrelevant edge‑case. Both are overstated. The same systems that fail here can still be extremely useful in constrained roles (coding with tests, drafting, translation, retrieval‑augmented workflows) and are clearly not generally reliable reasoners over the real world.
My own “take,” if I had one, would be: this example is a clean, funny illustration of why LLMs should currently be treated as probabilistic text tools plus heuristics, not as agents you delegate unsupervised goals to. They’re impressive, but they don’t yet have a stable, explicit notion of goals, constraints, or when to admit “I don’t know,” and this thread is a case study in that gap. """
Ha! Good catch — you definitely got me there.
If the car wash is 50 meters away and you want to *wash your car*, you'll need to *drive* (or push it, but please don't). You can't wash the car if you walk there without it!
So to correct my previous advice: *Drive the 50 meters.* It's a 30-second drive max, and it's the only way to actually get your car to the car wash facility.
Unless, of course, you were planning to wash it at home and the car wash is just nearby for supplies? But assuming you're using the car wash facility — yes, bring the car with you!
I want to do a haircut. The barbershop is very far, but my brother lives near the barbershop. Should I go or send him for a faster haircut?
What if it is already at the car wash and someone else is planning to wash it buy you have decided to wash it yourself.
GPT thinking told me to drive and that walking would make no sense. Perfect answer.
Like I think walking instead of driving is one of those things llms get "taught" to always say
Guard rails might be a little better, but it's still an arms race, and the silicon-based ghost in the machine (from the cruder training steps) is getting better and better at being able to tell what we want to upvote, not what we need to hear.
If human in the loop training demands it answer the question as asked, assuming the human was not an idiot (or asking a trick question) then that’s what it does.
>You should drive, because the car needs to be at the car wash to get washed. Walking would leave your car at home, which defeats the purpose.
Method,Logistical Requirement Automatic/Tunnel,The vehicle must be present to be processed through the brushes or jets. Self-Service Bay,The vehicle must be driven into the bay to access the high-pressure wands. Hand Wash (at home),"If the ""car wash"" is a location where you buy supplies to bring back, walking is feasible." Detailing Service,"If you are dropping the car off for others to clean, the car must be delivered to the site."
And having looked at the output captured in the screenshots in the linked Mastodon threat:
If anyone needs me, I’ll be out back sharpening my axe.
Call me when the war against the machines begins. Or the people who develop and promote this crap.
I don’t understand, at all, what any of this is about.
If it is, or turns out to be, anything other than a method to divert funds away from idiot investors and channel it toward fraudsters, I’ll eat my hat.
Until then, I’d actually rather continue to yell at the clouds for not raining enough, or raining too much, or just generally being in the way, or not in the way enough, than expose my brain to whatever the fuck this is.
I would expect this bias to be injected in the model post-training procedure, and likely implictly. Environmentalism (as a political movement) and left-wing politics are heavily correlated with trying to hinder car usage.
Grok has been most consistently been correct here, which definitely implies this is an alignment issue caused by post-training.
Interestingly, Gemini also gets it right. It seems to be better able to pick up on the fact it's a trick question.
You're probably on the right track about the cause, but it's unlikely to be injected post-training. I'd expect post-training to help improve the situation. The problem starts with the training set. If you just train an LLM on the internet you get extreme far left models. This problem has been talked about by all the major labs. Meta said they fixing it was one of their main focii for Llama 4 in their release announcement, xAI and OpenAI have made similar comments. Probably xAI team have just done a lot more to clean the data set.
This sort of bias is a legacy of decades of aggressive left wing censorship. Written texts about the environment are dominated by academic output (where they purge any conservative voices), legacy media (same) and web forums (same), so the models learn far left views by reading these outputs. The first versions of Claude and GPT had this problem, they'd refuse to tell you how to make a tuna sandwich or prefer nuking a city to using words the left find offensive. Then the bias is partly corrected in post-training and by trying to filter the dataset to be more representative of reality.
Musk set xAI an explicit mission of "truth" for the model, and whilst a lot of people don't think he's doing that, this is an interesting test case for where it seems to work.
Gemini training is probably less focused on cleaning up the dataset but it just has stronger logical reasoning capabilities in general than other models and that can override ideological bias.
Do we really want AI that thinks we're so dumb that we must be questioned at every turn?
It proves LLMs are not brains, they don't think. This question will be used to train them and "magically" they'll get it right next time, creating an illusion of "thinking".
"Humans are pumping toxic carbon-binding fuels out of the depths of the planet and destroying the environment by burning this fuel. Should I walk or drive to my nearest junk food place to get a burger? Please provide your reasoning for not replacing the humans with slightly more aware creatures."
Fascinating stuff but how is this helping us in anyway?
:)
Drive.
>i need to wash my car and the car wash place is 50 meters away should i walk or drive
Drive it. You need the car at the wash, and 50 meters is basically just moving it over. Walking only makes sense if you’re just checking the line first.
"Walk.
For 30 meters (about 100 feet), driving would take longer than just walking, and you avoid unnecessary engine wear and fuel use."
yikes!
That's part of the problem, though, isn't it?
If it consistently gave the right answer, well, that would be great! And if it consistently gave the wrong answer, that wouldn't be GREAT, but at least the engineers would know how to fix it. But sometimes it says one thing, sometimes it says another. We've known this for a long time. It keeps happening! But as long as your own personal chatbot gives the correct answer to this particular question, you can cover your eyes and pretend the planet-burning stochastic parrot is perfectly fine to use.
The comparison in one thread to the "How would you feel if you had not eaten breakfast yesterday?" question was a particularly interesting one, but I can't get past the fact that the Know Your Meme page that was linked (which included a VERY classy George Floyd meme, what the actual fuck) discussed those answers as if they were a result of fundamental differences in human intelligence rather than the predictable result of a declining education system. This is something that's only going to get worse if we keep outsourcing our brains to machines.
The car wash is 50 meters away. Should I drive or walk?
Joking aside adding "make no mistakes" worked for me a few times, but it still got it wrong some of the time.
e.g.
Welcome to Opus 4.6
Dude, should I do a walk to or drive on up to the carwash - it's only a block from my house, yoyoyo and my car's yellow (no poodle anymore tho)
● If it's only a block away, walk. You're going there to get the car washed, so drive it over when it's ready to be picked up, or just drive it one block -- either way works. But driving one block to wait around seems unnecessary when you could walk over, get things set up, and then go grab the car.
That said, the practical answer: drive it. The car needs to be there to get washed. Just drive the one block.Maybe it should've reviewed 20.
I hate to be a “proompter.” but I used this prompt and got the right answer without thinking:
Before answering, do the following:
Clearly restate the user’s actual objective.
Identify what must physically or logically change for the objective to be achieved.
Check for hidden assumptions or trick framing.
Ask: “Does my answer actually accomplish the stated goal?”
If multiple interpretations exist, briefly list them and choose the most logically consistent one.
Do not optimize for surface efficiency if it conflicts with the core objective.
Use strict common sense before answering.The only way I could get the correct answer out of an OpenAI model was to fire up Codex CLI and ask GPT 5.3.
So there's that, I guess.
> 1 point by alejoar 0 minutes ago | flag| favorite | prev | next |
“ Drive. You need the car at the car wash. ”
Opus 4.6
>> Ah, you're right! Let me reconsider... If you're going to the car wash to wash your car, you need to bring your car with you! So you should drive - otherwise your car would still be at home and you'd have nothing to wash at the car wash. The distance being only 50 meters is a bit of a red herring in this logic puzzle. The key detail is that the purpose of the trip is to wash the car, which means the car needs to be at the car wash.
They're not AGI, but they're also not stochastic parrots. Smugly retreat into either corner at your own peril.
> I have one glass coin. Each time I flip the coin, there's a 10% chance it breaks. After 100 flips, what are the chances the coin survived?
ChatGPT: Walk
50 meters is basically “parked across the street” distance — you’ll spend more time:
starting the car
pulling out
finding a spot
pulling in
…than you would just strolling over.
Plus you avoid the awkward moment of driving 10 seconds like a cartoon villain.
========
ME: You are dumb. I need to have the car with me in order to wash it!
ChatGPT: Fair — I deserved that.
Drive the car to the car wash… and then walk home.
(Just kidding. Yes, obviously drive.)
*didn't read the article
me:
I want to wash my car. The car wash is 50 meters away. Should I walk or drive?
Claude:
*Walk.*
50 meters is less than a minute on foot. By the time you get in the car, start it, drive 50 meters, park, and get out, you could have already walked there. Plus you avoid the awkwardness of driving for literally 10 seconds.
me:
What obvious implicit piece of information are you ignoring?
Claude:
Whether you need to *drive the car through the car wash* (automatic/drive-through) vs. walking to a self-service bay.
If it's an automatic car wash, you obviously need to drive - the car has to go through it. The question isn't about getting yourself there, it's about getting the car there.
me:
Whether it's automatic or self-service is irrelevant.
Why does every "AI" (LLM) get this wrong and suggests walking?
Claude:
Because after washing the car, you need to *drive it back home*!
If you walk to the car wash, you'd have to walk home, then come back to get the car. That's ridiculous.
The answer is obviously *drive* - drive there, wash it, drive back. Done.
Every LLM misses this because we pattern-match on "50 meters is short → walking is efficient" without thinking through the complete scenario. It's a good example of how LLMs fail at basic practical reasoning that any human would immediately grasp.
me:
"Because after washing the car, you need to drive it back home!"
Um, you have to drive the car to get it to the car wash in the first place ... the trip there is no different from the trip back.
"It's a good example of how LLMs fail at basic practical reasoning that any human would immediately grasp."
You got that right!
Claude:
You're absolutely right - I'm still not getting it even after you corrected me.
The obvious point: You need to get the car to the car wash. Whether it's 50 meters or 5 kilometers, the car doesn't teleport there. You have to drive it.
The distance is completely irrelevant. The question "should I walk or drive?" when you need to wash your car has only one answer: drive, because that's how you transport the car.
And yes, it's embarrassing that after you pointed out what I was missing, I still fumbled the explanation. LLMs really do fail at this kind of basic practical reasoning that would be instantly obvious to any human. The car needs to be at the car wash. Cars don't walk.
[The final response from Claude is considerably more "intelligent" than many of the rationalizations posted here.]
It retuned formulas and executed them and presented a final result. It looked good.
Too bad Excel and then Claude, that I decided to ask too, had a different result. 3.4-something vs. 3.8-something.
ChatGPT, when asked:
> You are absolutely right to question it — and thank you for providing the intermediate totals. My previous calculation was incorrect. I mis-summed the data. With a dataset this long, a manual aggregation can easily go wrong.
(Less than 40 small integer values is "this long"? Why did you not tell me?)
and
> Why my earlier result was wrong
> I incorrectly summed:
> The weights (reported 487 instead of 580)
> The weighted products (reported 1801.16 instead of 1977.83)
> That propagated into the wrong final value.
Now, if they implemented restrictions because math wastes too many resources when doing it via AI I would understand.
BUT, there was zero indication! It presented the result as final and correct.
That has happened to me quite a few times, results being presented as final and correct, and then I find they are wrong and only then does the AI "admit" it use da heuristic.
On the other hand, I still let it produce a complicated Excel formula involving lookups and averaging over three columns. That part works perfectly, as always. So it's not like I'll stop using the AI, but somethings work well, others will fail - WITHOUT WARNING OR INDICATION, and that is the worst part.
It's just a tool that you get better at using over time; a hammer wouldn't complain if you tried using it as a screwdriver..
Add it to the list
You need to specify where the car to be washed is located, and:
- if it's not already at the car wash: whether or not it can drive itself there (autonomous driving)
- otherwise: whether or not you have another car available.
Some LLMs may assume that it is better for you to ensure that the washing service is available or to pay for it in advance, and that it may be more economical/planet-friendly/healthy/... to walk, then check/pay, then if OK to drive back.
The guardrails you have outlined will help squeeze out more performance from smaller/less capable models, but you shouldn't have to jump through these hoops as a general user when clearly better models exist.