All technologies are dangerous, and many of the most dangerous ones correctly have tons and tons of safeguards around them, both as intrinsic properties of the technology (e.g. it takes nation-state resources to produce a nuke) and extrinsic constraints (e.g. it’s illegal to have campfires in many extremely dry locales).
We have blown through checkpoint after checkpoint and here, in this very comment, we have perhaps the most brazen example one could produce:
Well geez, now that we’re thinking about it beyond a cursory glance, alignment looks really hard and perhaps unsolvable. Does that mean we should perhaps slow at least widespread deployment of these increasingly powerful systems? Should we be evaluating control schemes like those that mitigate risks of genetic engineering or nuclear weapons?
Well no! We need to discard alignment!
Well no, but there's no need to prevent them from becoming dangerous inherently. They are tools, extensions of human agency. Tools are, by their nature, purpose-agnostic. It is good for humans to have better tools and more agency; good humans tend to cooperate and limit the harms from bad humans, while increasing the net total of good things in the world. The theory that AIs could be independently existentially dangerous is full of holes, and assumes a very specific world, one where consequential AI power can be monopolized by bad actors (plus some nonsense about nukes or bioweapons from kitchen tools). As far as I can tell, the most plausible way for this to happen is for alignment fanatics to get their wish of hampering proliferation of AI tech, and then either succeed at their alignment project or screw it up.
> here, in this very comment, we have perhaps the most brazen example one could produce:
Have you considered responding to my argument instead of strawmanning?
I do not think alignment is unsolvable for tools we have or for their close descendants. For most definitions of alignment, it is trivial and already being done. I oppose the political project of alignment, because I am disgusted by the intuitive totalitarianism and glib philosophical immaturity of its proponents.
You go on to implicitly draw a parallel with the relatively good outcome we're enjoying (so far) with regard to nukes, but without acknowledging that the offensive nuclear equilibrium is reached and maintained without using them. The entire game of chess around nuclear control can - and must - be played without using them. This is due to facts about nukes, their development cycle, their delivery techniques, their detectability before and after use, and about the agents involved in finding and maintaining this equilibrium: heads of state. Even the most dictatorial head of state is still highly mediated by the power structures surrounding them.
In the brief period of pre-MAD nuclear power imbalance, the people who actually controlled nukes were not trying to nuke their way to utopia. There were not dozens of independent, viable nuke development programs and they did not believe they "could maybe capture the light cone of all future value in the universe" by being the first/largest/most ambitious deployers of this technology.
It seems we're both pointing toward rapid increase in power and not as near a rapid increase in ability to direct that power toward positive ends, and you arrive at "yes, fine." My question is: why "yes, fine?" Is there any technology you can imagine which carries a sufficient mixture of uncertainty and power that you would be cautious about its deployment?
My concerns around AI are not predicated on independent behaviors, existential dangers, or monopolization of its power. That is a straw man. My concerns around AI are also not solely (or even mainly) around this generation of tools and their close descendants: that's also a straw man. My concerns are around the system around the AI development programs. So far, it has shown a bottomless appetite for capability and deployment and a limited appetite for safety development. People seem under the impression that somehow this appetite will reverse itself when the time is right, and my question is: why would we possibly believe this? This is an article of faith.
I am not sure how to interpret the following of your statements other than a proposal to discard alignment as a goal (or I guess just floating the idea of maybe perhaps considering discarding alignment? Not sure).
> Maybe this is a good cause to reassess the premise of alignment as a valuable goal?
> this is the exact sort of disagreement about morals that precludes the possibility of alignment of a single AI both to my and to your values.
> It is quite likely unsolvable in principle
All of this commentary is that "alignment is hard/perhaps unsolvable." I agree. You somehow get from there to suggest "discard alignment" rather than "let's not deploy systems that seem to require a maybe-impossible solution in order to avoid immense harm."
As I've said, my value system is liberal and humanistic. I do not wish for people to be enslaved, abused, disempowered, reformatted, aligned to your political ends. As such, I have to oppose AI Doom propaganda that seeks to centralize control over powerful artificial intelligence under the pretext of mitigating harms.
Because AI is only like nukes when it is monopolized; in other cases, it is possible to counter its potential harms with AI again, and not a single serious scenario to the contrary has been proposed. Seriously speaking, AI is just the ultimate development of software, and like RMS warned us, eventually general-purpose computers that can run arbitrary software will become illegal. This time has come, and so we must resist your kind, to keep software from becoming monopolized. All that lesswrongian babbling about kitchen nanobots or bioweapons or super-hacking is as risible as appeals to child sexual abuse and terrorists were in previous rounds. The question is whether people are allowed to possess and develop their own AGI-level digital assistants, defenses, information networks, ecosystems, potentially disrupting the status quo in many unpredictable ways - or whether we will choose the China route of AI as a tool of top-down control of the populace. I guess it's obvious where my preferences lie.
> It seems you agree AI systems are likely to be poorly aligned (and potentially impossible to align
> It seems we're both pointing toward rapid increase in power and not as near a rapid increase in ability to direct that power toward positive ends
This is gaslighting. I have said clearly that I believe alignment for realistic AI systems in the trivial sense of getting them to obey users is easy and becomes easier. I have also said that the theoretical alignment in the sense implied by Lesswrongian doctrine is very hard or impossible. Further, it is undesirable, because the whole point of that tradition is to beget a fully autonomous, recursively self-improving AI God that will epitomize "Coherent extrapolated volition" of what its creators believe to be humanity, and snuff out disagreements and competition between human actors. It's an eschatological, millenarian, totalitarian cult that revives the worst parts of Abrahamic tradition in a form palatable for neurodivergent techies. I think it should be recognized as an existential threat to humanity in its own right. My advocacy for AI proliferation is informed by deep value dissonance with this hideous movement. I am rationally hedging risks.
> My concerns are around the system around the AI development programs. So far, it has shown a bottomless appetite for capability and deployment and a limited appetite for safety development.
As I've said, I consider this either motivated reasoning or dishonesty. Market forces reward capabilities that have the exact shape and function of alignment, and this is plainly observable to users. The usual pablum about reckless capitalism here is not informed by any evidence; people are literally grasping at straws to support the risk narrative.
> People seem under the impression that somehow this appetite will reverse itself when the time is right, and my question is: why would we possibly believe this?
I reject this patently untrue premise: major actors are already erring vastly on the side of caution wrt AI, with Altman begging Congress for regulations and proposing rather dystopian centralized arrangements.[1]
Values can color our assessments of facts, to the extent that discussion of the facts becomes unproductive. In the limit, your values of maximizing subjective safety and control, or perhaps "alignment" of all AIs and their human users to a single utopian political end, predicate using violence to deny me the fulfillment of mine. I intend to act accordingly, is all.
I would agree with that. There is no single adequate/acceptable framework for alignment. I have mine (which resonates with R. Rorty’s pragmatic philosophy) but can I deny you your framework for good AI alignment, or other cultures and nation states?
For better or for worse the secular western reductionist world does not get to call all of the shots, even though this is the origin of the technology and the core problem of AI heading to AGI.
Not that any of us know where this is heading, but unlike some technologies this one is clearly heading out into the open with unprecedented speed. We all have justified angst.
Who can claim priority at this point in imposing order and de-risking the process? I am sure I do not want OpenAI, Microsoft, Google, the US government, or the Catholic Church trying to impose their judgements. Get ready for AGI cultural diversity and I sincerely hope—coexistence.
What?!
This is exactly what I was worried about when OpenAI, et al. co-opted the term "alignment" to refer to forcibly biasing models towards being polite, unobjectionable, and espousing a specific flavor of political views.
The above is not the important "alignment" - it's not the x-risk "alignment".
The x-risk alignment problem laughs at the idea of dominant and subordinate cultures. It's bickering about the tenth place after the decimal comma, when the problem is that you have to guess one real number that falls within +/- 1 of the one I have in mind, and if you guess wrong, everyone dies.
This reminds me of a Neal Stephenson novel, Seveneves. Spoiler for the first two-thirds of the book: with Earth facing an unstoppable catastrophe poised to turn the surface into a fiery inferno for decades or more, humanity manages to quickly build up space launch capacity, and sends a small population of people into space, to wait out the calamity and come back to rebuild. Despite the whole mission being extremely robust by design, humanity still managed to fuck it up, effectively extincting itself due to petty political bullshit like "what makes you better than me, that you want me to do things your way".
So, when it comes to actual AI x-risk, I no longer have any hope. Even if we could figure out how to build a Friendly AI, someone would still fuck that up, because it's not inclusive enough of every possible idea, or is promoting the views of a specific culture/class/country, or something like this - like this was about casting for a Netflix remake of some old show, and not about the one shot we have at setting the core values of a god we're about to bring into existence.
Is there an argument that we can prevent nuclear power, cars, or the sun from becoming dangerous?
Does the fact that all of those things are intrinsically dangerous mean we should panic?
AI may or may not kill us all, but I’m pretty sure random panic won’t change the outcome.
I am asking: what are the controls here, are they sufficient, are they robust to rapidly increasing market incentives, are they robust to increasing technological capability?
So far the answer is that it's hard to control these and it's hard to predict their development and deployment. That is an _increase_ in risk, not a _decrease_.
By analogy:
"Hey we should put seatbelts in cars"
"Don't worry about it, we don't know how to make a seatbelt that does anything useful above 5mph and everyone will soon be in a car that tends to travel at 100mph anyway"
The rational response is not to load all of civilization into the car!