
0 points · philipkglass · 3y ago · 0 comments

Banning blinding laser weapons was easy to get consensus on. Banning AI research is more like banning laser research.

I have read Bostrom's Superintelligence: Paths, Dangers, Strategies so I think I have reasonable exposure to the arguments that AI could drive humans extinct. But I didn't find any one of the scenarios plausible enough to frighten me. If there is a new strongest argument for why AI is too dangerous, developed since that 2014 book, I'm willing to read it.

Yudkowsky's open letter makes me think that the arguments are not a lot stronger now than in 2014:

"A sufficiently intelligent AI won’t stay confined to computers for long. In today’s world you can email DNA strings to laboratories that will produce proteins on demand, allowing an AI initially confined to the internet to build artificial life forms or bootstrap straight to postbiological molecular manufacturing."

Doing anything novel with biology requires experimentation. A million-times-faster thinker won't be able to advance biological research a million times faster. It'll be able to do the experimental design parts faster, and the post-experimental evaluation faster, but it'll have to wait just as long as a human postdoc waits for things to grow and assays to run. It's Amdahl's Law applied to scientific research: the achievable speedup is a modest multiple because even unsleeping geniuses have to wait for experiments.
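To put rough numbers on the Amdahl's Law point, here is a minimal sketch. The split between "thinking" time (design and evaluation, which the AI can speed up) and "waiting" time (growth and assays, which it can't) is a made-up assumption for illustration, not measured data, and the function name is hypothetical.

```python
def amdahl_speedup(thinking_fraction: float, thinking_speedup: float) -> float:
    """Overall speedup when only the 'thinking' share of a project gets faster.

    thinking_fraction: share of total project time spent on design and
        evaluation (an assumed, illustrative number).
    thinking_speedup: how much faster that share becomes.
    """
    if not 0.0 <= thinking_fraction <= 1.0:
        raise ValueError("thinking_fraction must be between 0 and 1")
    if thinking_speedup <= 0:
        raise ValueError("thinking_speedup must be positive")
    # Time spent waiting on experiments is not sped up at all.
    waiting_fraction = 1.0 - thinking_fraction
    return 1.0 / (waiting_fraction + thinking_fraction / thinking_speedup)


# Hypothetical splits: 50%, 90%, and 99% of the time is thinking.
for fraction in (0.5, 0.9, 0.99):
    overall = amdahl_speedup(fraction, 1e6)
    print(f"{fraction:.0%} thinking -> {overall:.1f}x overall")
```

Under these assumptions, a million-fold faster thinker gets roughly 2x, 10x, and 100x overall speedups respectively. The waiting fraction sets the ceiling at 1 / waiting_fraction no matter how fast the thinking gets.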

If I were to interpret Yudkowsky's "bootstrap straight to postbiological molecular manufacturing" in the most charitable way possible, maybe he's using an implausible scenario as a sort of didactic scary story when trying to communicate his fears to the public. But I'd then like to understand and evaluate the actual scenario he's talking about.


3 comments · 2 top-level

lukevp · 3y ago · 1 in thread

Doing something significant with biology requires experimentation for humans. That doesn’t mean it does for every intelligence. It’s quite possible that we have collected enough data in every scientific paper ever written (which an AI could process and understand instantaneously) that an accurate simulation could be run within a computing context. If there are parts of the simulation that are inaccurate, the AI could design physical experiments that are more controlled and isolated to answer any unknowns and make the simulation more accurate with minimal iterations. Assuming this AI is massively more intelligent than us, it isn’t unfeasible that it could find some zero-day exploits to take over all cloud computing to run the simulations, for example, so even the compute unavailability isn’t a great argument as to why it couldn’t do this.

philipkglass (OP) · 3y ago

Intelligence is a limited, partial substitute for knowledge. The problem with Aristotelian physics wasn't that Aristotle wasn't smart enough. It was that Aristotle lacked the ground truth knowledge to understand where he was wrong. An AI that can't run experiments will end up with elegant hallucinations of nanofactory blueprints but no actual nanofactory. (That's even assuming that Drexlerian nanotech is actually something that can be built. It's still an open question.)

All the computing power in the world is insufficient to simulate a living bacterium at the molecular level. We also don't have a molecular resolution map of bacteria in the first place. The AI will have to figure out how to make one. The AI can try to develop faster high fidelity approximations for molecular simulation, but that too requires calibration against experiments as well as a lot of simulation resources. If it seizes control of public cloud computing for its simulations people will notice and can just unplug cables. If its intrusions are subtle, only using little unnoticeable compute slices when machines are largely idle, then its simulations are going to get less compute resources in aggregate than researchers have now.

NumberWangMan · 3y ago

You're right about laser weapons vs laser research. A real problem is that non-lethal AI is very lucrative.

I haven't read the book, so I'm not familiar with which scenarios in particular Bostrom talks about. But I believe I'm versed enough in the issues with AI alignment to carry on an intelligent conversation about it. When you say that none of the scenarios were plausible enough, could you give an example of what you mean?

To give an example from my end, I'm really just imagining a scenario where we give the AGI some sort of goal, such as curing cancer. We've thought about the ways things could go wrong, and used reinforcement learning from human feedback to make an AI that, in training, seems to have learned not to do anything that could cause serious harm to a human. But it turns out that it actually learned to game our rewards, and just learned not to do anything to harm humans that could be traced back to it. Similarly, we tried to train it not to acquire more resources and only use what we give it, but it just learned to hide its activity. We know reward hacking is extremely common in AIs trained this way. GPT-4 is still willing to provide users with forbidden information -- it hasn't actually internalized the goal of "don't provide people with information that they could likely use to harm themselves or others"; it has learned some superficially similar goal that we don't really understand.

So anyway, the AI didn't actually learn to care about humanity, only about putting on that appearance, like a sociopath that learned to fake emotions to blend in. It has learned the ability to reason and plan, in order to design virtual experiments to calculate protein folding, etc. We provide it with a plain language description of the problem we want solved, specifying a ton of caveats like:

* Don't take actions that would result in the greater chance of death of people with cancer.
* Don't try to detect people with a high likelihood of cancer and kill people before they get it.
* Try to come up with a solution to this that most people would be happy with.

We can't specify things like "don't do anything that might result in someone dying" because in previous tests the AI would refuse to try to cure cancer, in case someone who was cured then went on to kill someone else.

No matter how much we specify, it turns out that a sufficiently intelligent, creative AI can come up with a maximal solution. Say, it secretly creates a drug that it distributes through the water supply that makes most people happy with just about anything, then builds a bunch of suspended animation chambers where it stores people who have cancer forever -- they can be revived, so they're not actually dead. It needs to keep things running forever, so of course it has to take over things in order to build enough of these chambers for everyone who will ever live. Now, that's not really possible if people keep reproducing, so it ends up just sticking everyone into these chambers, which they're fine with because of the happy-drug. The earth continues silently through space, with all of humanity protected in suspended animation by our over-zealous AI. Oh, and any life that we mentioned in restrictions is also in suspended animation, while every other bit of life has been exterminated as the AI tried to maximize its resources in order to keep this state of affairs in place forever.

It's a thing where no matter how many restrictions you try to stick on to the problem, a sufficiently smart AI can probably come up with a better solution in one of the loopholes than it can by just doing the boring, human thing of trying to figure out an effective cancer drug that works in 98% of cases. The AI tries to get to 100%, which necessitates a more drastic solution. If you tell it just to come up with a drug with a greater than 90% effectiveness rate, it STILL wants to take over the world in order to acquire enough computing power to guarantee that the effectiveness is no less than 90%. We don't know how to build "satisficers" that are happy with good enough, and even if we could, "satisficers" will tend to build maximizers.

That's exactly what humans (which are kinda satisficers, sometimes) are doing by building AI -- we're trying to find maximal solutions to our problems, except I'm worried that we're not quite smart enough so we're gonna screw it up real bad.

I think we have a chance of building an AI that's aligned closely enough not to kill us, as long as it's not that much smarter than us. But the smarter it gets, the more even a 2-degree misalignment matters. And they're already terrifyingly smart.
