undefined | Better HN

0 pointswhstl9d ago0 comments

Exactly that. I can give an example.

After watching Legal Eagle, I asked a legal-ish questions about the Bricks and Minifigs case. Claude was outdated about the case and gave me some outdated info, so I tried to update it with the info I just saw online.

I updated by telling it I saw something in a LegalEagle video. It proceeded to tell me the video doesn't exist and I was hallucinating it, in a quite combative manner.

I provided a link and it insisted it didn't exist, with a quite verbose answer, once again very combative and arguing that I was talking in bad faith.

I provided a transcription from Youtube and it backtracked a bit but said I should have provided a transcription at the beginning of the conversation, since I knew the video existed.

I didn't say much to it, just a few sentences like "video is here: <youtube link>" and "I got its transcription: <pasted text>".

0 comments

26 comments · 6 top-level

SwellJoe9d ago· 11 in thread

You're misunderstanding what these models do. It is a limitation of LLMs. They don't have memory, they do not learn, they cannot learn. The sooner you let go of your desire to have them learn or remember anything, the sooner you will achieve enlightenment (or, just a peaceful life where there is no possibility of getting into an argument with a machine).

If you want it to synthesize information that is not in its training data (from a few months ago), you can ask it to research the topic. But, arguing with an LLM is like putting lipstick on a pig. Only the machine is incapable of becoming annoyed. It has infinite patience to continue being wrong forever.

Your mental model of what Claude is and does is the problem here. Short of a revolutionary breakthrough in AI techniques, the LLMs will continue to do matrix math across a huge bunch of weights that cannot change based on anything you say.

card_zero9d ago

That's wrestling with a pig. "You both get dirty, and the pig likes it."

I guess putting lipstick on a pig might entail some wrestling, but it's a different idiom.

SwellJoe8d ago

I was actually thinking of "Never attempt to teach a pig to sing; it wastes your time and annoys the pig."

So, yes, mixed up idioms. But, the machine doesn't like arguing either. It is incapable of liking or disliking things.

jaggederest9d ago

This is also a change in specifically Opus 4.8 / perhaps Fable 5 (I didn't really get enough of a baseline to see it there as much), where it's much more skeptical. For my purposes, this is fabulous - one of my pat addendums to most prompts is "challenge my assumptions and check the evidence empirically", and boy does it.

Obscurity43409d ago

> fabulous

I think you mean fableuous ;)

1122339d ago

They did not misunderstand anything. All of the behaviour is not inherent in raw base model and has been planted by the agressive, secretive reinforcement learning they do for benchmaxxing, "safety" and all other things. Claude begins any other sentence with "honestly". That is not how LLMs work, that is how they work after being RLed to the brink.

coldtea9d ago

>Your mental model of what Claude is and does is the problem here. Short of a revolutionary breakthrough in AI techniques, the LLMs will continue to do matrix math across a huge bunch of weights that cannot change based on anything you say.

Sorry, but your mental model is wrong.

LLMs do matrix math across "a huge bunch of weights that cannot change based on anything you say", but the matrix math and results are informed (key concept here) by what you said, including the memory of what you said earlier in the discussion (and in some setups, even across discussions).

That's what a bloody prompt does.

It's entirely logic for the parent to want the LLM's matrix math + model + internal prompt, to accepts its prompt about LegalEagle and work with that, instead of arguing and giving him shit about it.

Especially since the earlier version of the model consistently worked like he wanted, and the new one consistently doesn't. He's not asking for some new unforeseen capability unknown to LLMs.

whstlOP9d ago

Exactly that.

I provided a question, and when given an incomplete answer, I provided with more info.

It refused to accept the additional info due to limited access to Youtube.

There was nothing more than that. There were no expectations.

The hostility and the amount of assumptions here are very strange.

...almost as strange as having a website accuse me of hallucinating a video and trying to gaslight it :D

1 more reply

magicalhippo9d ago

But unless you're using the API, it's not just a model.

I asked Gemini Flash 3.5 through the Gemini app something that followed a similar pattern. I asked about something, it replied with outdated info, I said that's outdated, it did a web search and apologized for being wrong, then proceeded to give me good info.

That wasn't just a bare model, that was a model wrapped in a harness, driving the model and allowing for web searches for example.

GPT in Codex is even more aggressive, I often see it proactively do web searches to ensure it's not feeding me wrong info.

whstlOP9d ago

You seem to be making a lot of assumptions about how I interacted in the messages to Claude.

You also seem to be making a lot of assumptions about my understanding of the models, especially considering I just told a story :)

I never said anywhere I want it to learn or remember, or that I argued with it.

I just provided additional information to it (in the form of a dozen or so words, tops, per message) and it accused me of hallucinating and trying to gaslight it.

My messages never went beyond a dozen words or so.

throw12345678919d ago

Show some examples, otherwise we're talking about interpretations.

2 more replies

blini-kot9d ago

yeah yeah and human brains are just cells firing ions with small electric charge

very witty and very cynical, thank you

operatingthetan9d ago· 7 in thread

These machines do not think and they do not have a mind. We may build such a thing in the future but these do not possess those qualities. It seems as if the majority of people do not understand this, which is why the public is so confused about why they produce output like they do.

coldtea9d ago

>These machines do not think and they do not have a mind

Well, they do think, in that they produce output that is indistinguisable from thinking. If a person produced the same output to the same questions, we'd considered them thinking, maybe dumb sometimes, or paranoid at others, but still a thinking person.

We can argue about the quality and depth of the thinking that LLMs do (and we can say it's much cruder than a human thinking architecture, and of course not real time), but an LLM quacks like a thinking duck and looks like a thinking duck.

operatingthetan9d ago

Indistinguishable output does not mean thinking occurred. It simply means you have the appearance of thinking. I believe thinking requires agency, which the LLM does not possess. As in, it has zero stakes.

It does not receive dopamine as a result for a good answer, and a split second after finishing your answer the very same GPU is probably translated french or something for someone in another state. This is a language generator which has a corpus of information and has been tuned to appear correct.

1 more reply

bombcar9d ago

That’s the problem - it seems like a mind but it doesn’t operate like the ones we’re used to.

Even a dog will learn from recent stimuli, these things don’t. The prompt just modifies.

3 more replies

whstlOP9d ago

I don't see how this has anything to do with my answer, but ok?

operatingthetan9d ago

An explanation for your story.

1 more reply

stingraycharles9d ago

The comment you’re replying to never implied that they think or have a mind. They merely stated that they respond in a dismissive way and not following instructions.

Basically the complaint is about how Claude is being trained.

1 more reply

blooalien9d ago

> "These machines do not think and they do not have a mind."

You're so totally 1000% right about that, but they're really good at faking it, to such a degree that entirely too many people (even including some so-called "experts" in the field) have been utterly fooled by the mathematical "trickery" that performs the illusion of "intelligence".

nrightnour9d ago· 2 in thread

I've spent thousands of hours using Opus and have never seen this. I'd double-check your claude.md files.

code_biologist9d ago

I've seen exactly this behavior on claude.com with no system prompt with Opus 4.8 specifically, especially around chronic illness stuff where there's established mainstream medicine dogma and reddit / internet communities with alternate causality theories and treatment approaches (PMDD and MCAS-adjacent illness). 4.6 is happy to analyze and consider them, 4.8 really doesn't like the alternate theories and treatments.

whstlOP9d ago

That's vanilla claude.com, without memory or custom prompt.

I use another service for coding.

It's interesting how my experience there is mirrored by the answers here, though!!!

throw12345678919d ago

It was trained on discussions held by large egos. This one reads to me like it was trained on some inflammatory discussions from kernel mailing lists.

true_religion9d ago

I think these models have been trained to not accept 'new facts', so they don't take in user input (or the far more problematic search engine, untrusted tool input) and have that change their world view.

However, that doesn't apply when they are told to roleplay a scenario, so its easier to get it to accept and create output with the idea that this true fact you've seen is part of a fictional scenario, than for it to output the same words within the context of the fact being real.

As an aside, I don't that I have to personify AI in explanations and that all discussions revolve around anecdotes, but I only know enough about the maths behind it to be dangerous, not useful. Does anyone else feel this way?

coldtea9d ago

Roko's Basillisk suddenly doesn't seem that far-fetched :)

j / k navigate · click thread line to collapse

0 comments

26 comments · 6 top-level

SwellJoe9d ago· 11 in thread

card_zero9d ago

That's wrestling with a pig. "You both get dirty, and the pig likes it."

I guess putting lipstick on a pig might entail some wrestling, but it's a different idiom.

SwellJoe8d ago

I was actually thinking of "Never attempt to teach a pig to sing; it wastes your time and annoys the pig."

So, yes, mixed up idioms. But, the machine doesn't like arguing either. It is incapable of liking or disliking things.

jaggederest9d ago

Obscurity43409d ago

> fabulous

I think you mean fableuous ;)

1122339d ago

coldtea9d ago

Sorry, but your mental model is wrong.

That's what a bloody prompt does.

It's entirely logic for the parent to want the LLM's matrix math + model + internal prompt, to accepts its prompt about LegalEagle and work with that, instead of arguing and giving him shit about it.

Especially since the earlier version of the model consistently worked like he wanted, and the new one consistently doesn't. He's not asking for some new unforeseen capability unknown to LLMs.

whstlOP9d ago

Exactly that.

I provided a question, and when given an incomplete answer, I provided with more info.

It refused to accept the additional info due to limited access to Youtube.

There was nothing more than that. There were no expectations.

The hostility and the amount of assumptions here are very strange.

...almost as strange as having a website accuse me of hallucinating a video and trying to gaslight it :D

1 more reply

magicalhippo9d ago

But unless you're using the API, it's not just a model.

That wasn't just a bare model, that was a model wrapped in a harness, driving the model and allowing for web searches for example.

GPT in Codex is even more aggressive, I often see it proactively do web searches to ensure it's not feeding me wrong info.

whstlOP9d ago

You seem to be making a lot of assumptions about how I interacted in the messages to Claude.

You also seem to be making a lot of assumptions about my understanding of the models, especially considering I just told a story :)

I never said anywhere I want it to learn or remember, or that I argued with it.

I just provided additional information to it (in the form of a dozen or so words, tops, per message) and it accused me of hallucinating and trying to gaslight it.

My messages never went beyond a dozen words or so.

throw12345678919d ago

Show some examples, otherwise we're talking about interpretations.

2 more replies

blini-kot9d ago

yeah yeah and human brains are just cells firing ions with small electric charge

very witty and very cynical, thank you

operatingthetan9d ago· 7 in thread

coldtea9d ago

>These machines do not think and they do not have a mind

operatingthetan9d ago

1 more reply

bombcar9d ago

That’s the problem - it seems like a mind but it doesn’t operate like the ones we’re used to.

Even a dog will learn from recent stimuli, these things don’t. The prompt just modifies.

3 more replies

whstlOP9d ago

I don't see how this has anything to do with my answer, but ok?

operatingthetan9d ago

An explanation for your story.

1 more reply

stingraycharles9d ago

The comment you’re replying to never implied that they think or have a mind. They merely stated that they respond in a dismissive way and not following instructions.

Basically the complaint is about how Claude is being trained.

1 more reply

blooalien9d ago

> "These machines do not think and they do not have a mind."

nrightnour9d ago· 2 in thread

I've spent thousands of hours using Opus and have never seen this. I'd double-check your claude.md files.

code_biologist9d ago

whstlOP9d ago

That's vanilla claude.com, without memory or custom prompt.

I use another service for coding.

It's interesting how my experience there is mirrored by the answers here, though!!!

throw12345678919d ago

It was trained on discussions held by large egos. This one reads to me like it was trained on some inflammatory discussions from kernel mailing lists.

true_religion9d ago

coldtea9d ago

Roko's Basillisk suddenly doesn't seem that far-fetched :)

j / k navigate · click thread line to collapse