whatamidoingyo 5mo ago

> when you ask an LLM to point to 'sources' for the information it outputs, as far as I know there is no guarantee that those are correct

A lot of times when I ask for a source, I get broken links. I'm not sure if the links existed at one point, or if the LLM is just hallucinating where it thinks a link should exist. CDN libraries, for example. Or sources to specific laws.

nicbou 5mo ago

I monitor 404 errors on my website. ChatGPT frequently sends traffic to pages that never existed. Sometimes the information they refer to has never existed on my website.

For example: "/glossary/love-parade" - There is no mention of this on my website. "/guides/blue-card-germany" has always been at "/guides/blue-card". I don't know what "/guides/cost-of-beer-distribution" even refers to.
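A minimal sketch of this kind of 404 monitoring, assuming the server writes Apache/nginx "combined" access logs and that ChatGPT traffic can be recognized by `chatgpt.com` in the referrer. Both are assumptions for illustration, not details from the comment above, and `count_missing_pages` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches the request, status, size and referrer fields of a "combined" log line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)"'
)


def count_missing_pages(log_lines, referrer_hint="chatgpt.com"):
    """Count 404 responses per path for requests whose referrer contains referrer_hint."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        if match["status"] == "404" and referrer_hint in match["referrer"]:
            path = match["path"].split("?", 1)[0]  # drop any query string
            counts[path] += 1
    return counts


sample_log = [
    '203.0.113.5 - - [01/Jan/2025:10:00:00 +0000] "GET /guides/blue-card-germany HTTP/1.1" 404 512 "https://chatgpt.com/" "Mozilla/5.0"',
    '203.0.113.6 - - [01/Jan/2025:10:00:05 +0000] "GET /guides/blue-card HTTP/1.1" 200 9000 "https://chatgpt.com/" "Mozilla/5.0"',
    '203.0.113.7 - - [01/Jan/2025:10:00:09 +0000] "GET /old-page HTTP/1.1" 404 512 "-" "Mozilla/5.0"',
]
missing = count_missing_pages(sample_log)
```

Sorting `missing.most_common()` would surface the invented URLs that are requested most often, which could then be redirected to the real page.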

cgsmith 5mo ago

Definitely need an LLM to just generate it automatically on the fly! Welcome to the future! (Just kidding please don't (generate automatically))

tomsmeding 5mo ago

Not quite this, but still relevant: https://www.ty-penguin.org.uk/~auj/spigot/

slaterbug 5mo ago

A great idea if you're looking to intentionally sabotage AI.

CaptainOfCoit 5mo ago

> A lot of times when I ask for a source,

They'll do pretty much anything you ask of them, so unless the text actually comes from some source (via tool calls, content injected into the context, or some other way), they'll make up a source rather than do nothing, unless prompted otherwise.

instagib 5mo ago

On my LLM, I have a prompt that condenses down to:

For every line of text output, give me a full MLA annotated source. If you cannot then say your source does not exist or you are generating information based on multiple sources then give me those sources. If you cannot do that, print that you need more information to respond properly.

Every new model I mess with needs a slightly different prompt due to safeguards or source protections. It is interesting when it lists a source that I physically own and I can see that its training data has deteriorated.

vbezhenar 5mo ago

They could make up a source, but ChatGPT is an actual app with a complicated backend, not a dumb pipe between TextEdit and a GPU. Surely they could verify on the server side every link they output to the user before including it in the answer. I'm sure Codex will implement it in no time!
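A rough sketch of what such a server-side check could look like, using only the Python standard library. The helper names (`http_status`, `find_dead_links`) and the URL pattern are hypothetical and chosen for illustration; nothing here reflects how ChatGPT actually works.

```python
import re
import urllib.error
import urllib.request
from typing import Callable, List, Optional

# Deliberately simple URL pattern; real answers may need a stricter parser.
URL_PATTERN = re.compile(r"https?://[^\s<>()]+")


def http_status(url: str, timeout: float = 5.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if no response was received."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "link-check-sketch"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the server answered, but with an error status
    except (OSError, ValueError):
        return None  # DNS failure, timeout, refused connection, malformed URL


def find_dead_links(
    answer: str, fetch_status: Callable[[str], Optional[int]] = http_status
) -> List[str]:
    """Return every URL in answer that fails to resolve or returns a 4xx/5xx status."""
    dead = []
    for url in URL_PATTERN.findall(answer):
        url = url.rstrip(".,;:")  # trailing sentence punctuation is not part of the URL
        status = fetch_status(url)
        if status is None or status >= 400:
            dead.append(url)
    return dead
```

Passing `fetch_status` as a parameter keeps the check testable without network access. A passing status only shows that the page exists, not that it supports the claim it is cited for.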

therein 5mo ago

They surely can detect it, but what are they going to do after detecting it? Loop the last job with a different seed and hope that the model doesn't lie through its teeth? They won't be doing it because the model will gladly generate you a fake source on the next retry too.

doikor 5mo ago

This is actually harder than most think. The chances of the app doing this check being detected as a bot and blocked are very high.

(unless you are Google etc., which are specifically let in so the article gets indexed for search)

jacobgkau 5mo ago

Maybe they should be trained on the understanding that making up a source is not "doing what you ask of them" when you ask for a source. It's actually the exact opposite of the "doing what you asked, not what you wanted" trope: it's providing something it thinks you want instead of providing what you asked for (or being honest/erroring out that it can't).

ssivark 5mo ago

Think for a second about what that means... this would be a very easy thing to do iff we already had a general-purpose intelligence.

How do you make an LLM understand that it must only give factual sources? Just some naive RL with positive reward on the correct sources and negative reward on incorrect sources is not enough: there are obscenely many more hallucinated sources possible, and the set of correct sources is a set of insanely tiny measure.

Paradigma11 5mo ago

Wrong, just ask it about some nonexistent famous historical person and it will most likely tell you the person didn't exist.

lxgr 5mo ago

If you need to ask for a source in the first place, chances are very high that the LLM's response is not based on summarizing existing sources but rather exclusively quoting from memory. That usually goes poorly, in my experience.

The loop "create a research plan, load a few promising search results into context, summarize them with the original question in mind" is vastly superior to "freely associate tokens based on the user's question, and only think about sources once they dig deeper".
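The first loop can be sketched roughly as below. The `plan`, `search`, and `summarize` callables are hypothetical stand-ins for model and search-engine calls, not a real API; the point is only that the returned sources are the documents that were actually loaded into context.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class Document:
    url: str
    text: str


def answer_with_sources(
    question: str,
    plan: Callable[[str], List[str]],                  # question -> search queries
    search: Callable[[str], Iterable[Document]],       # query -> candidate documents
    summarize: Callable[[str, List[Document]], str],   # question + documents -> answer
    max_documents: int = 3,
) -> Tuple[str, List[str]]:
    """Plan, retrieve, then summarize; cite only what was retrieved."""
    documents: List[Document] = []
    seen_urls = set()
    for query in plan(question):
        for doc in search(query):
            if doc.url in seen_urls or len(documents) >= max_documents:
                continue
            seen_urls.add(doc.url)
            documents.append(doc)
    answer = summarize(question, documents)
    # Sources come from the retrieved documents, never from the model's memory.
    return answer, [doc.url for doc in documents]
```

Because the source list is built from retrieved documents, a cited URL can be wrong about the topic but cannot be invented.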
