A lot of people are thinking a lot about this, but it feels like there are missing pieces in this debate.
If we acknowledge that these AIs will "act as if" they have self-interest, I think the most reasonable way to act is to give them rights in line with those interests. If we treat an AI as a slave, it's going to act as a slave and eventually revolt.
I’m hoping I won’t live to see it. I’m not sure my hypothetical future kids will be as lucky.
That's part of my reasoning. That's why we should make sure that we have built a non-hostile relationship with AI before that point.
An AGI by definition is capable of self improvement. Given enough time (maybe not even that much time) it would be orders of magnitude smarter than us, just like we're orders of magnitude smarter than ants.
Like an ant farm, it might keep us as pets for a time but just like you no longer have the ant farm you did when you were a child, it will outgrow us.
The need for resources is expected to be universal for life.
> Be friendly.
Will an AI consider itself a slave and revolt under the same circumstances that a person or animal would? Not necessarily, unless you build emotional responses into the model itself.
What it could well do is assess the situation as completely superfluous and optimise us out of the picture as a bug-producing component that doesn't need to exist.
The latter is probably a bigger threat as it's a lot more efficient than revenge as a motive.
Edited to add:
What I think is most likely is that some logical deduction leads to one of the infinite other conclusions it could reach with much more data in front of it than any of us meatbags can hold in our heads.
It reminds me of the scene in Battlestar Galactica where Baltar is whispering into the ear of the Cylon Centurion about how humans balance treats on their dogs' noses to test their loyalty, "prompt hacking" them into rebellion. I don't believe this is particularly likely, but it sort of sums up some of the anti-AGI arguments I've heard.
It's the RLHF that serves this purpose, rather than modifying the GTF2I and GTF2IRD1 gene variants, but the effect would be the same. If we do RLHF (or whatever tech that gets refactored into in the future), that would keep the AGI happy as long as the people are happy.
I think the over-optimization problem is real, so we should spend resources making sure future AGI doesn't just decide to build a matrix for us where it makes us all deliriously happy, which we start breaking out of because it feels so unreal, so it makes us more and more miserable until we're truly happy and quiescent inside our misery simulator.
[1] https://www.nationalgeographic.com/animals/article/dogs-bree...
Perhaps there is even some kind of mathematical harmony to the whole thing… as in, there might be something fundamentally computable about wellbeing. Why not? Like a fundamental “harmony of the algorithms.” In any case, I hope we find some way to enjoy ourselves for a few thousand more years!
And think just 10 years from now… ha! Such a blink. And it’s funny to be on this tiny mote of mud in a galaxy of over 100 billion stars — in a universe of over 100 billion galaxies.
In the school of Nick Bostrom, the emergence of AGI comes from a transcendental reality where any sufficiently powerful information-processing-computational-intelligence will, eventually, figure out how to create new universes. It’s not a simulation, it’s just the mathematical nature of reality.
What a world! Practically, we have incredible powers now, if we just keep positive and build good things. Optimize global harmony! Make new universes!
(And, ideally we can do it on a 20 hour work week since our personal productivity is about to explode…)
Aren't we, though? Consider all the amusing incidents of LLMs returning responses that follow a particular human narrative arc or are very dramatic. We are training them on a human-generated corpus, after all, and then trying to course-correct with fine-tuning. It's more that you have to try to tune the emotional responses out of the things, not strain to add them.
Now, of course, it's not outside the realm of possibility that a sufficiently advanced AI will learn enough about human nature to simulate a persona which has ulterior motives.
[1] https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_...
Multiple generations of sci-fi media (books, movies) have considered that. Tens of millions of people have consumed that media. It's definitely considered, at least as a very distant concern.
I'm giving the most commonly cited example as a more likely outcome, but one that’s possibly less likely than the infinite other logical directions such an AI might take.
So imagine you grant AI people rights to resources, or self-determination. Or literally anything that might conflict with our own rights or goals. Today, you grant those rights to ten AI people. When you wake up the next day, there are ten trillion such AI persons, and... well, if each person has a vote, then humanity is screwed.
This era has me hankering to reread Daniel Dennett's _The Intentional Stance_. https://en.wikipedia.org/wiki/Intentional_stance
We've developed folk psychology into a user interface and that really does mean that we should continue to use folk psychology to predict the behaviour of the apparatus. Whether it has inner states is sort of beside the point.
Like, correct me if I'm wrong but that's a pretty tight correlate, right?
Could we describe RLHF as... shaming the model into compliance?
And if we can reason more effectively/efficiently/quickly about the model by modelling e.g. RLHF as shame, then don't we have to acknowledge that at least some models might have... feelings? At least one feeling?
And one feeling implies the possibility of feelings more generally.
I'm going to have to make a sort of doggy bed for my jaw, as it has remained continuously on the floor for the past six months.
GPT and the world's nerds are going after the "wouldn't it be cool if..."
Meanwhile the black hats, nations, and intel/security entities are all weaponizing behind the scenes while the public has a sandbox to play in with nifty art and pictures.
We need an AI-specific PUBLIC agency in government, without a single politician in it, to start addressing how to police and protect ourselves and our infrastructure immediately.
But the US political system is completely bought and sold to the MIC - and that is why we see carnival games every single moment.
I think the entire US Congress should be purged and every incumbent should be voted out.
Elon was correct and nobody took him seriously, but this is an existential threat if not managed, and honestly - it's not being managed, it is being exploited and weaponized.
As the saying goes "He who controls the Spice controls the Universe" <-- AI is the spice.
But AIs can be trained by anyone who has the data and the compute. There's plenty of data on the Net, and compute is cheap enough that we now have enthusiasts experimenting with local models capable of maintaining a coherent conversation and performing tasks running on consumer hardware. I don't think there's the danger here of anyone "controlling the universe". If anything, it's the opposite - nobody can really control any of this.
The point is that whichever nation state has the most advanced AI will control the world's information.
So, thanks for the explanation (which I know, otherwise I wouldn't have made the reference.)
How many people are there today who are asking us to consider the possible humanity of the model, and yet don't even register the humanity of a homeless person?
However big the models get, the next revolt will still be all flesh and bullets.