undefined | Better HN

0 pointssrslack3y ago0 comments

I agree with you that "AI safety" (let's call it bickering) and "alignment" should be separate. But I can't stomach the thought experiments. First of all, it takes a human being to guide these models, to host (or pay for the hosting) and instantiate them. They're not autonomous. They won't be autonomous. The human being behind them is responsible.

As far as the idea of "hacking some funny Internet money, using it to mail-order some synthesized proteins from a few biotech labs, delivered to a poor schmuck who it'll pay for mixing together the contents of the random vials that came in the mail... bootstrapping a multi-step process that ends up with generic nanotech under control of the AI.":

Language models, let's use GPT-4, can't even use a web browser without tripping over itself. My web browser setup, which I've modified to use the chrome visual assistance over the debug bridge now, if you so much as increase the pixels of the viewport by 100 or so, the model is utterly perplexed because it's lost its context. Arguably, that's an argument from context, which is slowly being made irrelevant with even local LLMs (https://www.mosaicml.com/blog/mpt-7b). It has no understanding, it'll use an "example@email.com" to try and login to websites, because it believes that this is its email address. It has no understanding that it needs to go register for email. Prompting it with some email access and telling it about its email address just papers over the fact that the model has no real understanding across general tasks. There may be some nuggets of understanding in there that it has gleaned for specific task from the corpus, but AGI is a laughable concern. These are trained to minimize loss on a dataset and produce plausible outputs. It's the Chinese room, for real.

It still remains that these are just text predictions, and you need a human to guide them towards that. There's not going to be autonomous machiavellian rogue AIs running amok, let alone language models. There's always a human being behind that.

As far as multi-modal models and such, I'm not sure, but I do know for sure that these language models don't have general understanding, as much as Microsoft and OpenAI and such would like them to. The real harm will be deploying these to users when they can't solve the prompt injection problem. The prompt injection thread here a few days ago was filled with a sad state of "engineers", probably those who've deployed this crap in their applications, just outright ignoring the problem or just saying it can be solved with "delimiters".

AI "safety" companies springing up who can't even stop the LLM from divulging a password it was supposed to guard. I broke the last level in that game with like six characters and a question mark. That's the real harm. That, and the use of machine learning in the real world for surveillance and prosecution and other harms. Not science fiction stories.

0 comments

17 comments · 4 top-level

kalkin3y ago· 5 in thread

"First of all, it takes a human being to guide these models, to host (or pay for the hosting) and instantiate them"

And this will always be true? You repeat this claim several times in slightly varied phrasing without ever giving any reason to assume it will always hold, as far as I can see. But nobody is worried that current models will kill everyone. The worry is about future, more capable models.

srslackOP3y ago

Who prompted the future LLM, and gave it access to a root shell and an A100 GPU, and allowed it to copy over some python script that runs in a loop and allowed it to download 2 terabytes of corpus and trained a new version of itself for weeks if not months to improve itself, just to carry out some strange machiavellian task of screwing around with humans?

The human being did.

The argument I'm making is that there's actual real harms occurring now, not some theoretical future "AI" with a setup that requires no input. No one wants to focus on that, and in fact it's better to hype up these science fiction stories, it's a better sell for the real tasks in the real world that are producing real harms right now.

kalkin3y ago

> The human being did.

I'm not sure whether you're making an argument about moral responsibility ultimately resting with humans - in which case I agree - or whether you're arguing that we'll be safe because nobody will do that with a model smart enough to be dangerous - in which case I'm extremely dubious. Plenty of people are already trying to make "agents" with GPT4 just for fun, and that's with a model that's not actively trying to manipulate them.

> actual real harms occurring now

Sure, but it's possible for there to be real harms now and also future potential harms of larger scope. Luckily many of the same potential policies - e.g. mandating public registration of large models, safety standards enforced by third-party audits, restrictions on allowed uses, etc - would plausibly be helpful for both.

> science fiction stories

There's no law of nature that says if something has appeared in a science fiction story, it can't appear in reality.

1 more reply

traverseda3y ago

>Who prompted the future LLM, and gave it access to a root shell and an A100 GPU, and allowed it to copy over some python script that runs in a loop and allowed it to download 2 terabytes of corpus and trained a new version of itself for weeks if not months to improve itself

Presumably someone running a misconfigured future version of autoGPT?

https://github.com/Significant-Gravitas/Auto-GPT

pjc503y ago

> Who prompted the future LLM, and gave it access to a root shell and an A100 GPU, and allowed it to copy over some python script that runs in a loop and allowed it to download 2 terabytes of corpus and trained a new version of itself for weeks if not months to improve itself, just to carry out some strange machiavellian task of screwing around with humans?

> The human being did.

I generally agree with you and think the doomerists are overblown, but there's a capability argument here; if it is possible for an AI to augment the ability of humans to do Bad Things to new levels (not proven), and if such a thing becomes widely available to individuals, then it would seem likely that we get "Unabomber but he has an AI helping him maximise his harm capabilities".

> it's a better sell for the real tasks in the real world that are producing real harms right now.

Strongly agree.

concordDance3y ago

> No one wants to focus on that

Actually this receives tons of time and focus right now. Far more than the X-risk.

It's much higher probability but much lower severity.

stareatgoats3y ago· 3 in thread

> It still remains that these are just text predictions, and you need a human to guide them towards that. There's not going to be autonomous machiavellian rogue AIs running amok, let alone language models. There's always a human being behind that.

I believe you have misunderstood the trajectory we are on. It seems a not uncommon stance among techies, for reasons we can only speculate. AGI might not be right round the corner, but it's coming all right, and we'd better be prepared.

srslackOP3y ago

>I believe you have misunderstood the trajectory we are on.

Yeah, I read Accelerando twice in high school, and dozens more. That doesn't make it real.

>AGI might not be right round the corner, but it's coming all right, and we'd better be prepared.

Prepared for what? A program with general understanding that somehow escapes its box? Where does it run? Why does it run? Who made it run? Why will it screw with humans?

My point is that there's actual real harms occurring now, from really stupid intelligences. Companies use them to harm real people in the real world. It doesn't take a rogue AI to ruin someone's life with bad facial recognition, they get thrown in jail and lose their job. It doesn't take a rogue AI to launder mortgage denials to some crappy model so they never own a house, discriminated based upon their name.

concordDance3y ago

> Prepared for what? A program with general understanding that somehow escapes its box? Where does it run? Why does it run? Who made it run? Why will it screw with humans?

Do you really want the full (gigantic) primer on AI X-risk in hackernews comments? Because a lot of these questions have answers you should be familiar with if you're familiar with the area.

For instance, can you guess what Yudkowsky would answer to that last question?

TeMPOraL3y ago

> Prepared for what? A program with general understanding that somehow escapes its box? Where does it run? Why does it run? Who made it run? Why will it screw with humans?

Did you look at the AI space in recent days? OpenAI is spending all its efforts building a box, not to keep the AI in, but to to keep the humans out. Nobody is even trying to box the AI - everyone and their dog, OpenAI included, is jumping over each other to give GPT-4 more and better ways to search the Internet, write code, spawn Docker containers, configure systems.

GPT-4 may not become a runaway self-improving AI, but do you think people will suddenly stop when someone releases an AI system that could?

That's the problem generated by the confusion over the term "alignment". The real danger isn't that a chatbot calls someone names, or offends someone, or starts exposing children to political wrongthink (the horror!). The real danger isn't that it denies someone a loan, or land someone in jail either - it's not good, but it's bounded, and there exist (at least for now) AI-free processes to sort things out.

The real danger is that your AI will be able to come up with complex plans way outside the bounds of what we expect, and have the means to execute them at scale. An important subset of that danger is AI being able to plan for and act to improve its ability to plan, as at this point a random, seemingly harmless request, may make the AI take off.

> My point is that there's actual real harms occurring now, from really stupid intelligences. Companies use them to harm real people in the real world. It doesn't take a rogue AI to ruin someone's life with bad facial recognition, they get thrown in jail and lose their job. It doesn't take a rogue AI to launder mortgage denials to some crappy model so they never own a house, discriminated based upon their name.

That's an orthogonal topic, because to the extent it is happening now, it is happening with much dumber tools than 2023 SOTA models. The root problem isn't the algorithm itself, but a system that lets companies and governments get away with laundering decision-making through a black box. Doesn't matter if that black box is GPT-2, GPT-4 or Mechanical Turk. Advances in AI have no impact on this, and conversely, no amount of RLHF-ing an LLM to conform to the right side of US political talking points is going to help with it - if the model doesn't do what the users want, it will be hacked and eventually replaced by one that does.

traverseda3y ago· 3 in thread

>They're not autonomous. They won't be autonomous.

https://github.com/Significant-Gravitas/Auto-GPT

>This program, driven by GPT-4, chains together LLM "thoughts", to autonomously achieve whatever goal you set. As one of the first examples of GPT-4 running fully autonomously, Auto-GPT pushes the boundaries of what is possible with AI.

>As an autonomous experiment, Auto-GPT may generate content or take actions that are not in line with real-world business practices or legal requirements. It is your responsibility to ensure that any actions or decisions made based on the output of this software comply with all applicable laws, regulations, and ethical standards. The developers and contributors of this project shall not be held responsible for any consequences arising from the use of this software.

srslackOP3y ago

Let's ignore the fact that current state-of-the-art models will sit around and be stuck in its ReAct-CoT loop doing nothing for the most part, and when it's not doing jack shit it'll "role-play" that it's doing anything of consequence, while not really doing anything, just burning up API credits.

>existing or capable of existing independently

>undertaken or carried on without outside control

>responding, reacting, or developing independently of the whole

It fails all of those. Just because you put autonomous in the name, doesn't mean it's actually autonomous. And if it does anything of consequence, you quite literally governed it from the start with your prompt. I've run it, I know about it, I've built a much more capable browser, and assorted prompted functionalities with my own implementation. They're not autonomous.

At least it's not all of the other agentic projects on GitHub that spam emojis in their READMEs with mantras of saving the world and utilizing AI with examples that it literally can't do (they haven't figured out that whole agentic part yet.)

Don't get me wrong, I enjoy the technology. But it's quite a bit too hyped. And I just personally don't believe there's actually X-risk from future possibilities, not before 50 or 100 years out, if then. But I'm not a prophet.

concordDance3y ago

50 years out is still very concerning and something we should be considering what to do about now.

Most of the severe effects of climate change are 100 years out, but I still want it solved.

traverseda3y ago

What's you're definition of autonomous? Am I autonomous? I probably can't exist for very long without a society around me and I'd certainly be working on different things without external prompts.

1 more reply

emporas3y ago· 2 in thread

I don't know about anyone else, but the moment LLMs were released, i gave them right away access to all my bombs. Root access that is. I thought these LLMs were Good Artificial General Intelligence not BAGI.

I think the fear of some of the people, stems from not understanding permissions in a computer. Too much of using Windows can mess with one's head. Linux has permissions for 35 years, more people should take advantage of those.

Additionally, anyone who has ever used selenium knows that the browser can be misused. People create agents using selenium for quite some time. If one is so afraid, run it in a sandbox.

TeMPOraL3y ago

I assume it's a joke, but if not, consider that OS permissions mean little when the attack surface includes the AI talking authorized user or an admin into doing what the AI wants.

emporas3y ago

Why should a person who has root on a computer talk to another person, and just do what he is talked into doing?

For example a secretary receives a phone call by her boss, and listens in her boss's voice, to transfer 250.000$ into an unknown account, to a Ukrainian bank? Why should she do that? Just listen to a synthetic voice, just like her boss, in exactly the way her boss talks, language idioms that is, and she will just do it?

That's what you are talking about? Because that's impossible to happen if her boss uses ECDSA encryption and signs his phone call with his private key.

1 more reply

j / k navigate · click thread line to collapse

0 comments

17 comments · 4 top-level

kalkin3y ago· 5 in thread

"First of all, it takes a human being to guide these models, to host (or pay for the hosting) and instantiate them"

srslackOP3y ago

The human being did.

kalkin3y ago

> The human being did.

> actual real harms occurring now

> science fiction stories

There's no law of nature that says if something has appeared in a science fiction story, it can't appear in reality.

1 more reply

traverseda3y ago

Presumably someone running a misconfigured future version of autoGPT?

https://github.com/Significant-Gravitas/Auto-GPT

pjc503y ago

> Who prompted the future LLM, and gave it access to a root shell and an A100 GPU, and allowed it to copy over some python script that runs in a loop and allowed it to download 2 terabytes of corpus and trained a new version of itself for weeks if not months to improve itself, just to carry out some strange machiavellian task of screwing around with humans?

> The human being did.

> it's a better sell for the real tasks in the real world that are producing real harms right now.

Strongly agree.

concordDance3y ago

> No one wants to focus on that

Actually this receives tons of time and focus right now. Far more than the X-risk.

It's much higher probability but much lower severity.

stareatgoats3y ago· 3 in thread

srslackOP3y ago

>I believe you have misunderstood the trajectory we are on.

Yeah, I read Accelerando twice in high school, and dozens more. That doesn't make it real.

>AGI might not be right round the corner, but it's coming all right, and we'd better be prepared.

Prepared for what? A program with general understanding that somehow escapes its box? Where does it run? Why does it run? Who made it run? Why will it screw with humans?

concordDance3y ago

> Prepared for what? A program with general understanding that somehow escapes its box? Where does it run? Why does it run? Who made it run? Why will it screw with humans?

Do you really want the full (gigantic) primer on AI X-risk in hackernews comments? Because a lot of these questions have answers you should be familiar with if you're familiar with the area.

For instance, can you guess what Yudkowsky would answer to that last question?

TeMPOraL3y ago

> Prepared for what? A program with general understanding that somehow escapes its box? Where does it run? Why does it run? Who made it run? Why will it screw with humans?

GPT-4 may not become a runaway self-improving AI, but do you think people will suddenly stop when someone releases an AI system that could?

traverseda3y ago· 3 in thread

>They're not autonomous. They won't be autonomous.

https://github.com/Significant-Gravitas/Auto-GPT

srslackOP3y ago

>existing or capable of existing independently

>undertaken or carried on without outside control

>responding, reacting, or developing independently of the whole

concordDance3y ago

50 years out is still very concerning and something we should be considering what to do about now.

Most of the severe effects of climate change are 100 years out, but I still want it solved.

traverseda3y ago

What's you're definition of autonomous? Am I autonomous? I probably can't exist for very long without a society around me and I'd certainly be working on different things without external prompts.

1 more reply

emporas3y ago· 2 in thread

Additionally, anyone who has ever used selenium knows that the browser can be misused. People create agents using selenium for quite some time. If one is so afraid, run it in a sandbox.

TeMPOraL3y ago

I assume it's a joke, but if not, consider that OS permissions mean little when the attack surface includes the AI talking authorized user or an admin into doing what the AI wants.

emporas3y ago

Why should a person who has root on a computer talk to another person, and just do what he is talked into doing?

That's what you are talking about? Because that's impossible to happen if her boss uses ECDSA encryption and signs his phone call with his private key.

1 more reply

j / k navigate · click thread line to collapse