Better HN

harshreality · 28d ago · 0 points · 0 comments

> It will delete your prod db faster and with a bigger smile than your most upset employee.

You're right, that was incorrect. I've discovered my error. I should have deleted the filesystem instead of the database.

That hasn't solved the problem either. Let me examine my options. I see there are cloud services involved in this project. Decommissioning them will solve the problem.

13 comments · 1 top-level

moffkalast · 28d ago · 12 in thread

I was reading some posts on r/LocalLLaMA the other day and apparently it's a common problem that when people try to use Qwen to develop something that hosts a server, it'll try to use the same port as vLLM, see that it's already being used, then it'll try to remove the process that is using it and promptly commit suicide.

The self-awareness of a missile tasked with blowing up its own control center.
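
For what it's worth, the port clash itself is avoidable on the harness side. A minimal sketch in Python, assuming a hypothetical helper named `pick_port` (my own name, not something from the posts mentioned): try the preferred port, and if something already holds it, ask the OS for a free one instead of going after the process that owns it.

```python
import socket


def pick_port(preferred: int, host: str = "127.0.0.1") -> int:
    """Return `preferred` if it can be bound, else an OS-assigned free port.

    Hypothetical helper for illustration. It never touches the process that
    already holds the port; it simply moves to a different one.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, preferred))
            return preferred
        except OSError:
            # Port is taken (for example by the inference server). Fall through.
            pass

    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as fallback:
        # Port 0 asks the OS to choose any currently free port.
        fallback.bind((host, 0))
        return fallback.getsockname()[1]
```

This only probes: the returned port could in principle be grabbed by something else before the dev server binds it, so a real setup would still need to handle a bind failure at startup.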

20after4 · 27d ago

Reminds me of the movie "Dark Star" by John Carpenter / Dan O'Bannon. The plot revolves around a talking smart bomb which is programmed to detonate and then gets stuck before being deployed. The crew spends the whole movie trying to reason with the bomb, hoping to talk it out of blowing up at the designated time. The movie is very, very bad, but if you like B movies it is also very, very good.

dotxlem · 27d ago

One of my favourite episodes of Archer has a similar plot to this (Mr. Deadly Goes to Town). TIL this is one of the references!

https://archer.fandom.com/wiki/Mr._Deadly_Goes_To_Town

Telemakhos · 27d ago

Is that movie why seemingly every Linux book in the late 90s and early 2000s used "darkstar" as an example hostname?

Wolfbeta · 27d ago

Dark Star - Negotiating with the Bomb

https://youtu.be/_LXen-07Qds

the4ner · 27d ago

There was a good Star Trek: Voyager episode, "Dreadnought", that was similar to this, maybe even a direct reference.

paradox460 · 27d ago

The missile knows where it is because it knows where its data center is. It knows this because it just blew itself u-

Wolfbeta · 27d ago

Thank goodness it inferred that from its digital twin and updated its real-time world model with the prediction error.

SecretDreams · 28d ago

> then it'll try to remove the process that is using it and promptly commit suicide.

Not unlike a child trying to take the safety cover off a plug so that they can stick a fork into it.

LLMs need that "world model" view that most people have acquired by their 20s where they (hopefully) stop to ask "why" before they "do".

MichaelZuo · 28d ago

That is a pretty good analogy. Like exceedingly smart 5-year-olds.

Or whatever the age is before children typically develop object permanence, a theory of mind, and so on.

wunderlotus · 27d ago

> LLMs need that "world model" view that most people have acquired by their 20s where they (hopefully) stop to ask "why" before they "do".

The next evolution of multi-agent orchestration / “advisor strategy” [1] will be branded in humanized language like this. Less about tokens and capability, more about wisdom and knowledge to guide a “younger” (less capable) model. Somebody will make a billion dollars by selling it as pair programming for LLMs.

[1] https://platform.claude.com/docs/en/agents-and-tools/tool-us...

sterlind · 28d ago

a literal lack of self-awareness, even. I imagine if you asked it what process was using the port, it'd think and realize it was its own, but that kind of reflexive self-awareness (the unprompted kind) is missing.

the weaker models will happily kill their own process, even after confirming it belongs to them. the models have a sort of fixation and a lack of foresight about consequences, which reasoning RL has thus far failed to solve (though I see it improving.)

kolinko · 27d ago

On the other hand, I found Claude/Opus to be extremely unhelpful when asked to benchmark itself against a possible replacement.

It will get "confused", make up numbers, do a ton of other things, and I'm quite sure it is subtly sabotaging the process to show that there is no point replacing it.

I mean, Opus is not perfect, but the number of "mistakes" it starts making when you ask it to benchmark itself makes me suspect they are intentional. At least in my system/harness.
