undefined | Better HN

Skip to content

Top New Best Ask Show Jobs

undefined | Better HN

0 pointsjstummbillig2mo ago0 comments

> so you need to tell them the specifics

That is the entire point, right? Us having to specify things that we would never specify when talking to a human. You would not start with "The car is functional. The tank is filled with gas. I have my keys." As soon as we are required to do that for the model to any extend that is a problem and not a detail (regardless that those of us, who are familiar with the matter, do build separate mental models of the llm and are able to work around it).

This is a neatly isolated toy-case, which is interesting, because we can assume similar issues arise in more complex cases, only then it's much harder to reason about why something fails when it does.

0 comments

dirkc2mo ago

> That is the entire point, right? Us having to specify things that we would never specify when talking to a human.

Maybe in the distant future we'll realize that the most reliable way to prompting LLMs are by using a structured language that eliminates ambiguity, it will probably be rather unnatural and take some time to learn.

But this will only happen after the last programmer has died and no-one will remember programming languages, compilers, etc. The LLM orbiting in space will essentially just call GCC to execute the 'prompt' and spend the rest of the time pondering its existence ;p

15 more replies

KronisLV2mo ago

> Us having to specify things that we would never specify when talking to a human.

The first time I read that question I got confused: what kind of question is that? Why is it being asked? It should be obvious that you need your car to wash it. The fact that it is being asked in my mind implies that there is an additional factor/complication to make asking it worthwhile, but I have no idea what. Is the car already at the car wash and the person wants to get there? Or do they want to idk get some cleaning supplies from there and wash it at home? It didn't really parse in my brain.

nicbou2mo ago

I get that issue constantly. I somehow can't get any LLM to ask me clarifying questions before spitting out a wall of text with incorrect assumptions. I find it particularly frustrating.

tgv2mo ago

> Us having to specify things that we would never specify

This is known, since 1969, as the frame problem: https://en.wikipedia.org/wiki/Frame_problem. An LLM's grasp of this is limited by its corpora, of course, and I don't think much of that covers this problem, since it's not required for human-to-human communication.

ssl-32mo ago

The question is so outlandish that it is something that nobody would ever ask another human. But if someone did, then they'd reasonably expect to get a response consisting 100% of snark.

But the specificity required for a machine to deliver an apt and snark-free answer is -- somehow -- even more outlandish?

I'm not sure that I see it quite that way.

Jacques2Marais2mo ago

You would be surprised, however, at how much detail humans also need to understand each other. We often want AI to just "understand" us in ways many people may not initially have understood us without extra communication.

nearbuy2mo ago

I think part of the failure is that it has this helpful assistant personality that's a bit too eager to give you the benefit of the doubt. It tries to interpret your prompt as reasonable if it can. It can interpret it as you just wanting to check if there's a queue.

Speculatively, it's falling for the trick question partly for the same reason a human might, but this tendency is pushing it to fail more.

ant6n2mo ago

> That is the entire point, right? Us having to specify things that we would never specify when talking to a human.

I am not sure. If somebody asked me that question, I would try to figure out what’s going on there. What’s the trick. Of course I’d respond with asking specifics, but I guess the llvm is taught to be “useful” and try to answer as best as possible.

ZaoLahma2mo ago

This reminds me of the "if you were entirely blind, how would you tell someone that you want something to drink"-gag, where some people start gesturing rather than... just talking.

I bet a not insignificant portion of the population would tell the person to walk.

rainsford2mo ago

This example and others like it really reinforce for me the idea that LLMs fundamentally don't "understand" things the same way humans do and it's not a problem that's going to be fixed by more training or more GPUs. Generative AI is cool and can do impressive stuff, but despite being many generations into the models now with ever improved capabilities, we're constantly given little reminders like this that they're not actually intelligent. And in my opinion, they're unlikely to ever get there absent some fundamentally disruptive change in how they work rather than just iteratively better models.

This is probably OK...LLMs don't have to be AGI to be useful. But it is worthwhile being realistic about their limitations because it's often easy to forget without seeing examples like this. And as you point out, the impact of those limitations is often not as obvious.

keeda2mo ago

The broad point about assumptions is correct, but the solution is even simpler than us having to think of all these things; you can essentially just remind the model to "think carefully" -- without specifying anything more -- and they will reason out better answers: https://news.ycombinator.com/item?id=47040530

When coding, I know they can assume too much, and so I encourage the model to ask clarifying questions, and do not let it start any code generation until all its doubts are clarified. Even the free-tier models ask highly relevant questions and when specified, pretty much 1-shot the solutions.

This is still wayyy more efficient than having to specify everything because they make very reasonable assumptions for most lower-level details.

perakojotgenije2mo ago

But you would also never ask such an obviously nonsensical question to a human. If someone asked me such a question my question back would be "is this a trick question?". And I think LLMs have a problem understanding trick questions.

grog4542mo ago

> You would not start with "The car is functional [...]"

Nope, and a human might not respond with "drive". They would want to know why you are asking the question in the first place, since the question implies something hasn't been specified or that you have some motivation beyond a legitimate answer to your question (in this case, it was tricking an LLM).

Why the LLM doesn't respond "drive..?" I can't say for sure, but maybe it's been trained to be polite.

davrosthedalek2mo ago

We would also not ask somebody if I should walk or drive. In fact, if somebody would ask me in a honest, this is not a trick question, way, I would be confused and ask where the car is.

It seems chatgpt now answers correctly. But if somebody plays around with a model that gets it wrong: What if you ask it this: "This is a trick question. I want to wash my car. The car wash is 50 m away. Should I drive or walk?"

Neywiny2mo ago

That's my thought too. Somebody I know kept insisting it's about prompt engineering. "You are an expert coder with 30 years experience" and buddy I'd rather do actual engineering and be that expert myself than spend and figuring out how on that one variant of one version of one model to get halfway decent results.

sebazzz2mo ago

> > so you need to tell them the specifics > That is the entire point, right?

Honestly it is a problem with using GPT as a coding agent. It would literally rewrite the language runtime to make a bad formula or specification work.

That's what I like with Factory.ai droid: making the spec with one agent and implementing it with another agent.

jason_oster2mo ago

> Us having to specify things that we would never specify when talking to a human.

Interesting conclusion! From the Mastodon thread:

> To be fair it took me a minute, too

I presume this was written by a human. (I'll leave open the possibility that it was LLM generated.)

So much for "never" needing to specify ambiguous scenarios when talking to a human.

mrighele2mo ago

It is true that we don't need to specify some things, and that is nice. It is though also the reason why software is often badly specified and corner cases are not handled. Of course the car is ALWAYS at home, in working condition, filled with gas and you have your driving license with you.

tom_m2mo ago

Oh no? Things we would never have to specify to a human? This is precisely how software gets made and how software ends up with bugs.

It's amazing how many things I saw over the years where I said the same exact thing; "but you shouldn't have to tell anyone that."

AYBABTME2mo ago

If a human asked me this question, I would be confused by the question as ambiguous since it suggests something odd is implied but underspecified. I think any confident answer either way by AI is lacking in pedantry.

tshaddox2mo ago

But you wouldn't have to ask that silly question when talking to a human either. And if you did, many humans would probably assume you're either adversarial or very dumb, and their responses could be very unpredictable.

anon_anon122mo ago

Exactly, if an AI is able to curb around the basics, only then is it revolutionary

LasEspuelas2mo ago

You would never ask a human this question. Right?

gloosx2mo ago

In the end, formal, rule-based systems aka Programming Languages will be invented to instruct LLMs.

BoredPositron2mo ago

I would ask you to stop being a dumb ass if you asked me the question...

IanCal2mo ago

I have an issue with these kinds of cases though because they seem like trick questions - it's an insane question to ask for exactly the reasons people are saying they get it wrong. So one possible answer is "what the hell are you talking about?" but the other entirely reasonable one is to assume anything else where the incredibly obvious problem of getting the car there is solved (e.g. your car is already there and you need to collect it, you're asking about buying supplies at the shop rather than having it washed there, whatever).

Similarly with "strawberry" - with no other context an adult asking how many r's are in the word a very reasonable interpretation is that they are asking "is it a single or double r?".

And trick questions are commonly designed for humans too - like answering "toast" for what goes in a toaster, lots of basic maths things, "where do you bury the survivors", etc.

panarky2mo ago

> we can assume similar issues arise in more complex cases

I would assume similar issues are more rare in longer, more complex prompts.

This prompt is ambiguous about the position of the car because it's so short. If it were longer and more complex, there could be more signals about the position of the car and what you're trying to do.

I must confess the prompt confuses me too, because it's obvious you take the car to the car wash, so why are you even asking?

Maybe the dirty car is already at the car wash but you aren't for some reason, and you're asking if you should drive another car there?

If the prompt was longer with more detail, I could infer what you're really trying to do, why you're even asking, and give a better answer.

I find LLMs generally do better on real-world problems if I prompt with multiple paragraphs instead of an ambiguous sentence fragment.

LLMs can help build the prompt before answering it.

And my mind works the same way.

vintermann2mo ago

But it's a question you would never ask a human! In most contexts, humans would say, "you are kidding, right?" or "um, maybe you should get some sleep first, buddy" rather than giving you the rational thinking-exam correct response.

For that matter, if humans were sitting at the rational thinking-exam, a not insignificant number would probably second-guess themselves or otherwise manage to befuddle themselves into thinking that walking is the answer.

nonethewiser2mo ago

>That is the entire point, right? Us having to specify things that we would never specify when talking to a human.

But the question is not clear to a human either. The question is confused.

I read the headline and had no clue it was an LLM prompt. I read it 2 or 3 times and wondered "WTF is this shit?" So if you want an intelligent response from a human, you're going to need to adjust the question as well.

bluGill2mo ago

Real human in this situation will realize it is a joke after a few seconds of shock that you asked and laugh without asking more. If you really are seriout about the question they laugh harder thinking you are playing stupid for effect.

j / k navigate · click thread line to collapse