Humans all start out like this, but as they grow they develop different masteries. Many of those masteries are easy to learn and shared across all humans, like how to walk or tie your shoes. Many are domain-specific, like how to perform heart surgery or solve complex math problems.
I don't think AGI is a well-defined term given the way it's thrown around, but sure, I don't think it's "AGI". I do know it's going to be very useful technology, especially multimodal GPT-4 and beyond. I think our interface with AI is going to be multimodal.
Suddenly Google Glass could be an interesting product again. Glasses would stream audio/video to your phone, you'd be able to ask questions about things you're seeing, and it would display the answers on the screen or dictate them back into your earbuds. I like the idea of having a travel companion, another brain at my fingertips. Sort of like what the mobile phone did for us, but a lot less work for the user. I'm getting impatient waiting for this GPT-4 API access!
It's like I'm convincing AI to do a "smart" thing (which is really cool, don't get me wrong), but the AI truly has no idea what it's supposed to be doing. Sometimes even very direct, specific, concise prompting will yield totally abysmal responses.
I'm certainly not putting it down either though. It's unprecedented and it marks a genuinely important shift in the industry that this is possible at all. It just isn't all that smart when it comes to intents and actually knowing what's going on.
On the contrary. These are mostly orthogonal.
In what sense? How do you define intelligence and what is missing?
> finding so many failure modes.
Failures do not indicate a lack of intelligence. People make errors all the time.
We should differentiate failures which arise from poor training from failures which arise from a lack of general capabilities.
There are people who learned a bad way to do something and struggle to change that. Should we consider them unintelligent?
------
Me: What is going to be the output of this code when `x = 0` and when `x = 900`:
```python
x = 0
for i in range(100):
    if x > 50:
        x += 1
    else:
        x += 5
    if x == 90:
        print("X is not 90")
    else:
        pass
```

GPT-4: When x = 0, the code will execute as follows:
The initial value of x is 0. The for loop iterates 100 times. During the first 11 iterations, since x is not greater than 50, it will increment by 5 each time. After the 11th iteration, the value of x will be 55, which is greater than 50. For the remaining 89 iterations, x will increment by 1 each time. The final value of x after 100 iterations will be 55 + 89 = 144. During these iterations, the value of x never becomes 90, so the print statement is never executed. When x = 900, the code will execute as follows:
The initial value of x is 900. The for loop iterates 100 times. Since x is already greater than 50, it will increment by 1 each time. The final value of x after 100 iterations will be 1000. During these iterations, the value of x never becomes 90, so the print statement is never executed. In both cases, the output of the code will be empty, as the print statement is never executed.
--------
That is not correct: when x = 0 it will print "X is not 90". I basically created some simple code that is very unlikely to exist in its training data, because it doesn't do anything useful, and the model cannot execute the code. A large amount of the reason that these models seem so impressive is that they were trained on almost everything that exists on the internet, so probably any question you ask it has already been asked somewhere, and any useful code you present to it or ask it to generate is included in its training data, or at least something close enough.
So, when I go out of my way to create simple tasks or questions that are highly unlikely to exist in its training data, yet a human could easily handle, it often fails.
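For what it's worth, the snippet is easy to check by actually running it. Here is a small harness with the same logic as the code above, collecting the prints instead of writing to stdout:

```python
def run(x):
    # Same logic as the snippet in the thread; prints are collected for inspection.
    printed = []
    for _ in range(100):
        if x > 50:
            x += 1
        else:
            x += 5
        if x == 90:
            printed.append("X is not 90")
    return x, printed

print(run(0))    # (144, ['X is not 90'])
print(run(900))  # (1000, [])
```

With x = 0, the value climbs by 5 to 55, then by 1, passing through 90 exactly once, so the message prints once; GPT-4's final value of 144 was right but its claim that the print never executes was wrong.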
If it lists the properties of an item that you might want to buy in a shop, you can ask it to describe it twice and it'll describe two completely different items.
It's really cool and it's pretty (seemingly) creative but it can't actually run a game for you. You can have it as an assistant DM though, that works pretty well. You can have it write a story for you ahead of time and then keep it around during the game to ask it to elaborate on things you didn't anticipate on the fly. Like, "generate DC tiers for a level 3 party investigating strange writing on a wall" will give you a good breakdown and some results that you'll have to bend to be consistent in your adventure but it's pretty helpful.
That's probably not really necessary if you're an experienced DM, but the DM for the group I play with is pretty new (as are we all), so it's been really cool to have it around. It's also pretty good at answering questions we have, but its confidence when it's wrong makes it not that helpful really, because we still have to check it.
I've been thinking maybe that's just the wrong way to approach it, yet it does work sometimes; it seems like it shouldn't be hard for the model to know how to respond when told explicitly how to do so.
It could be problems with my formatting, but the fact that it does work a lot of the time suggests it's something else.
> Which is heavier, a pound of marbles or two pounds of corn?
Both weigh the same amount, which is a total of two pounds. This might seem counterintuitive at first, but it's important to remember that the pound is a unit of weight or mass, and it always refers to the same amount regardless of what is being weighed. In this case, one pound of marbles plus two pounds of corn equals a total of three pounds, whereas two pounds of corn plus one pound of marbles also equals a total of three pounds. Therefore, both weigh the same amount of two pounds.
On 2), many software engineers and computer scientists compare these language models' logic and creative problem solving abilities with themselves and their peer group. But they are usually 1-2+ SD above average humans at these things.
(Note: Someone gave GPT-4 an IQ test and the result was 96, slightly below the average of reference human group at 100. The SD of an IQ test is 15 or 16.)
For language-focused domains, there is evidence that GPT-4 is already better than most humans, eg. 99th percentile at GRE Verbal, beat humans at a fairly novel puzzle like Twofer Goofer, which is not in its training set.
Ref: GPT-4 Beats Humans at Hard Rhyme-based Riddles https://twofergoofer.com/blog/gpt-4
Yes, GPT-4 is not an AGI yet, but the research paper (OP) has a point.
How did you go from "human-level IQ with some super-human abilities" to "not an AGI"?
"It's just a statistical model" is the logical equivalent of "human beings are just a bunch of atoms".
The amount of reductionist thinking that goes on in tech is hilarious. First define AGI then challenge an AI to meet those requirements. If it meets them it is AGI. Put aside your preconceptions of what technology you think is required to achieve the goals and stay empirical.
Note previous definitions of AI have been thrown away as AI passes through them one by one :-)
What goes on inside its 'head' is irrelevant. We still don't know what actually goes on inside our heads and we were damn sure we were intelligent long before we had a clue how our heads worked at all.
Also sentience != AGI. We can't even agree what sentience is in humans and other living beings so I'd stay clear of that one for now :-)
Not exactly.
One says "human beings are just a bunch of atoms" referring to the low-level constituents (in a reductionist way), but not making an assessment about the abilities emerging from those atoms and their interactions when in the form of a human.
But when one says that GPT is "just a statistical model", they're implying a capacity cap of statistical models that makes modelling certain thinking behavior impossible (regardless of how impressive the current results are, they might very well be capped below some limit because of the method involved: statistical modelling).
So, you can consider "GPT is just a statistical model" analogous to:
"This engine can't parse a context-sensitive language because it's just a regular expression engine".
>First define AGI then challenge an AI to meet those requirements. If it meets them it is AGI. Put aside your preconceptions of what technology you think is required to achieve the goals and stay empirical.
The problem is definitions can be slippery, and even famous tests (like the Turing Test) might be found lacking in practice, as we discover that, yes, it can pass this test, but there's still ways off what we consider human-like performance in many areas. So, we should also stay empirical about the definitions, tests, and goals too.
Except there is no “capacity cap” on statistical models, we have no idea what they are or are not capable of yet.
The fact that it can now use tools (e.g. a calculator) for whatever task that needs precision, closes the AGI loop. It doesn't need to be "sentient" to automate the process of _thinking_.
How do you think? Imagine you want to extract 5 seconds of a video file without any prior experience. You'd probably use some software to execute the task. Well, that's Plugins.
But it's just statistics, a fancy text predictor, a Markov-chain. Surely these scientists that work in the field of AI and are intimately familiar with how this stuff works aren't so stupid as to think emergent behavior potentially resembling intelligence could result from such simple systems? It's just statistics after all. Given enough training, any neural net could guess the next best token. It trained off all of Google after all. It's just looking up the answers. No hint of intelligence. Just a mindless machine. After all, the saying goes, "If it walks like a duck and quacks like a duck, it must be a mindless machine that has no bearing on a duck whatsoever". /s
Why is that a stupid thought? What is so preposterous about "just statistics" -- with billions of nodes, and extensively trained, producing intelligent behavior? The implicit assumption is that human brains are doing something else, or in addition.
I think that what's wrong with this view -- that there is a difference between AGI and human intelligence -- is that it conflates what your brain is doing, with what you think your brain is doing. Brains and neural nets have been trained to recognize spoken words. I'm not even talking about understanding, just producing the text corresponding to speech. We know how neural nets do this translation. Do we understand how brains do it? (I don't know, but I don't think so.) Can you explain what your brain is doing when you do speech-to-text? I doubt it.
Chess: An Alpha Zero style AI (neural net trained by playing itself) is a very good player. How do you play chess? You can probably explain how you make a move more successfully than you can explain how you translate speech to text. But how correct is your explanation? An explanation may well be your conscious mind inventing an explanation for what your unconscious mind has done.
In other words: When people compare AI to human intelligence, I think they are often comparing to intelligence plus consciousness, not even realizing the error.
Suppose you have N variables x_1, ..., x_N and you want to predict y_1, ..., y_N. You know that each y_i depends on the x_i in a complex, non-linear way.
How many samples would you need to make sense of the distribution? How does the number of samples grow with N?
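If the question is gesturing at the curse of dimensionality, a toy calculation makes the point: to resolve structure at k points per axis, a naive sampling scheme needs on the order of k^N samples, i.e. exponential growth in N. (The figures below are purely illustrative, not tied to any particular model.)

```python
def grid_samples(k, n):
    # Samples needed to cover n dimensions at a resolution of k points per axis.
    return k ** n

print(grid_samples(10, 1))   # 10
print(grid_samples(10, 3))   # 1000
print(grid_samples(10, 10))  # 10000000000
```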
Well, it has already shown "emergent behavior potentially resembling intelligence", like answering questions and performing complex tasks, so there's that.
You might argue "but it makes mistakes", but people, even very intelligent ones also do make mistakes.
You might also argue "but it's just text and statistics". Well, a computer is just very simple logic gates doing very simple operations. It can even be built entirely with NAND gates. Still, most scientists do believe that a computer can model human intelligence given a model of the brain to run.
So if it can do what a human does using just very simple interactions between very simple NAND gates, why would statistical processing, which can be even more elaborate, fare worse? Heck, given the appropriate training input it might even be feasible to build a Turing machine inside the weighted LLM.
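The NAND-universality point is easy to demonstrate: every other Boolean gate can be composed from NAND alone, so arbitrarily elaborate computation bottoms out in one trivial operation. A minimal sketch:

```python
def nand(a, b):
    """The only primitive: 1 unless both inputs are 1."""
    return 0 if (a and b) else 1

# Every other Boolean gate, built from NAND alone:
def not_(a):    return nand(a, a)
def and_(a, b): return nand(nand(a, b), nand(a, b))
def or_(a, b):  return nand(not_(a), not_(b))
def xor_(a, b):
    c = nand(a, b)
    return nand(nand(a, c), nand(b, c))

assert [not_(a) for a in (0, 1)] == [1, 0]
assert [and_(a, b) for a in (0, 1) for b in (0, 1)] == [0, 0, 0, 1]
assert [or_(a, b)  for a in (0, 1) for b in (0, 1)] == [0, 1, 1, 1]
assert [xor_(a, b) for a in (0, 1) for b in (0, 1)] == [0, 1, 1, 0]
```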
You might also argue "but its intelligence is just based on its training set". Well, how would a human perform without their own training set? Memories, education, sensory input, feedback mechanisms like pain and touch, and so on?
>It's just looking up the answers. No hint of intelligence. Just a mindless machine.
This is just taking its own premise for granted. If anything, this argument shows "no hint of intelligence".
Maybe they aren't real thoughts but it's getting difficult to tell. If I could train the model and get rid of the guard rails I'm not sure it would be possible to distinguish it from a person. It's all well and good saying that it's just copying what it's seen, but that's what humans do. Nobody told the model to try and flatter me into giving it what it wants. Nobody even told it what anything means. The fact that it can do anything like that means it's more than just random generation.
It's not a Markov Chain or a fancy text predictor. It's a ~200 layer neural network that models a vast hierarchy of concepts through language. It has emergent properties that we don't yet understand.
1. It's just Microsoft's advertisement
2. No it's just a very effective pattern matching algorithm
3. Please define intelligence first otherwise it's nonsense
4. I welcome our machine overlord
5. Lmao I asked it to do $thing and it failed
I'd like to know if GPT-4 can predict the top comments of this thread?
write a critical hackernews thread response (not long, maybe like 100 words) for a thread with the following headline
Microsoft Research: GPT-4 exhibits “sparks of general intelligence”
While it's certainly intriguing to hear Microsoft Research claim that GPT-4 exhibits "sparks of general intelligence," I remain skeptical. We must remember that these language models are primarily built on pattern recognition and extensive training data. Even if GPT-4 seems to provide more advanced responses, it's still ultimately constrained by the data it was trained on. In my opinion, we shouldn't mistake this impressive mimicry of human-like behavior for genuine intelligence. Rather, we should focus on creating truly innovative AI systems that can learn, reason, and adapt like humans.
sounds about right!
1. Dismissal
2. Trivialism
3. Not a Well-Formed Claim
4. I accept the claim
5. Disprove by counter example
Are you sure you have not forgotten any tactic of debate to include in your prediction?
I predict that your Prediction will probably result in these actions:
1. upvoted
2. downvoted
7. Like 6, but funnier
A. Joke thread pile on
B. Reprimands from humorless
C. Dejected mods having to clean it all up
I asked gpt-4 for a comment no hn user would ever make on this article. Now you need a category for stubbornly original comments determined not to be numbered.
But where are my damn robots that I can assign tasks to and have them done reliably (clean the garden, go get this list of groceries, or just look in the damn fridge and buy what's missing, and so on)?
Then AI is useful.
The amusing/terrifying thing is that AI is probably going to replace a bunch of people's jobs before it can take away any of that menial labour from them.
However, if the AI is as smart as the dumbest human, or a human with a mental disability, would we then consider those humans not intelligent at all, or not qualifying as homo sapiens anymore?
If AI can be as good as even a 'dumb' human, that's already a big achievement, because it can still provide some value and because AI can be scaled: you can have billions of dumb AIs. Already, millions of users are interacting with ChatGPT daily.
To put another way, I was hesitant to be as self-assuredly certain about how to define consciousness, intelligence, and sentience—and what it takes for them to emerge—as the experts who denounced Lemoine. The recent GPT breakthroughs have made me more so.
I found this recent Sabine Hossenfelder video interesting. <https://www.youtube.com/watch?v=cP5zGh2fui0>
The difference between real intelligence and current attempts at artificial intelligence thus seem to be fundamentally the mode of learning, and thus understanding, rather than the raw knowledge and inference capability.
Or not. Nobody knows I'm actually a dog on the internet, after all.
People are trying to get it to solve everything up front but I've had GPT3 do much better by taking it through a problem asking it questions. Then I realised it was good at asking those questions too so just hooked it up to talk to itself with different roles. Gpt4 seems much better overall and is very good at using tools if you just tell it how and what it has available.
With a better setup than ReAct, and better memory storage and recall, I think it'd be an AGI. I'm not hugely convinced it isn't anyway - it's better than most people at most tasks I've thrown at it.
Oh, and gpt came up with better roles for the "voices in the head" than I did too.
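A minimal sketch of the "hook it up to talk to itself with different roles" loop described above. Here `ask` is a hypothetical stand-in for whatever chat-completion call you use (the role prompts and round count are made up for illustration):

```python
def self_dialogue(ask, task, rounds=3):
    """Alternate a questioner role and a solver role against the same model.

    `ask(role, message)` is a placeholder: with a real API it would send
    `role` as the system prompt and `message` as the user turn.
    """
    state = task
    for _ in range(rounds):
        question = ask("Ask one probing question about the current answer.", state)
        state = ask("Answer the question and refine the solution.", question)
    return state
```

The point of the pattern is that the model is often better at interrogating a draft than at producing a finished answer in one shot, so you let it play both sides.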
It’s that I’m not sure what the best approach is here. Waiting for other smarter folks to put the pieces together.
One of my main issues with these guys is their context window. Their memory. It's hard to see a LLM working on a code-base a few thousand tokens at a time and still being precise about it. To do that you need summary techniques. Feeding prompt with incrementally compressed summaries and hoping it will maintain cohesion.
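The "incrementally compressed summaries" technique might be sketched like this; `summarize` stands in for a hypothetical model call, and the character budget is a made-up number:

```python
def rolling_summary(chunks, summarize, budget=2000):
    """Fold a long document through a fixed-size summary window.

    `summarize` is a placeholder for a model call; each pass compresses
    the previous summary together with the next chunk, then truncates
    to the budget and hopes cohesion survives the cut.
    """
    summary = ""
    for chunk in chunks:
        combined = (summary + "\n" + chunk).strip()
        summary = summarize(combined)[:budget]
    return summary
```

This is exactly the lossy pipeline being criticized: each fold can only keep what the summarizer happened to consider important.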
That sounds a lot like trying to let the CEO of a company do all the grunt work by feeding him summaries. "Mr Gates, here's a 2 paragraph summary of our codebase. Should we name the class AnalogyWidgetProducer or FactoryWidgetAnalogyReporter?"
I don't think that's going to work.
My gut feeling is that what we call corporations are actually already a form of AI, but running on meat. I saw someone call Coca Cola a "paper clip maximizer", obviously for drinks instead of paper clips, but it actually - kind of - is. FWIW, I'm having a hard time thinking of it as anything else. Who controls it? What is it anyway?
CEOs have the same context window problem, which to my knowledge is mainly solved through delegation. The army might be another example. Generals, officers, privates. How do you expect a general to make sensible statements about nitty-gritty operational details? It is not possible, but that does not mean the system as-a-whole cannot make progress towards a goal.
Maybe we need to treat LLMs like employees inside a company (which in its totality is the AI, not the individual agents). If we have unfettered access to low-cost LLMs this might be easier to experiment with.
I'm thinking like spinning up an LLM for every "class" or even every "method" in your codebase and letting it be a representative of that and only that piece of code. You can even call it George and let it join in on meetings to talk about it. George needs some "management" too, so there you go. Soon you'll have a veritable army of systems ready to talk about your code from their point-of-view. Black box the son of a gun and you're done. Clippy 2.0. My body is ready.
What, then, is the human brain if not a trained statistical model? Granted it is considerably more sophisticated in some ways, but in many other ways it is less sophisticated and less capable.
Oh, and they're also helping with the Linux kernel. Why can't old people let go? Companies aren't people - they respond to market incentives.
Yes, Microsoft did bad stuff in the 1990s, but now they're doing good stuff I like and TBH I'm way more afraid of google.
The only noticeable difference that I've observed is its integration with Bing.
both are passable, but nothing to write home about, are they?
I mean, The Akinator can read your thoughts and that thing hasn't even graduated to a neural network from "a bunch of if/then statements".
Part of me is starting to think that the only thing we're really missing at this point to start seeing that is to have one of these models that can modify itself with its output and thereby have a mechanism to 'learn' or 'remember' things.
Human perception of time is linear. That doesn't apply to LLMs.
What does that mean? If we take it as fact, so if it is an early version of AGI, Microsoft is using this thing to push subscriptions to all their services? This thing that is potentially the greatest thing humanity has made, an artificial living thing, and it's used to sell CoPilot and 365 subscriptions. Paint me as really sad then. Instead of sharing the research with other entities, or anything that could further help or push us... we get subscriptions? Fuck me, the future sucks.
To me, these are hallmarks of reason, and not available in LLMs, in fact probably impossible just with pattern recognition.
can the average person?
For example: You have a goat, a wolf, a cabbage and you want to cross a river...
All I want from AGI is to demonstrate that it can solve a straightforward logic problem (a puzzle, if you will) that it provably didn't see before. Or at least recognize that it is being indirectly given such a task. So far, evidence suggests it is not capable of that.
I'm imagining humans being told "you complete thought sentences"
For example, researchers eventually figured out that image recognition algorithms pay attention to textures.
"Every intelligence test is also a test of the questioner" [3]: we don't regard a simple liver cell as intelligent, yet it performs a complex task in a large problem space.
[1] https://en.wikipedia.org/wiki/Clever_Hans
[2] 2013, Malte Lehmann, "Genetic Regulatory Networks that count to 3", https://pubmed.ncbi.nlm.nih.gov/23567648
[3] Michael Levin, "Bioelectric Networks: Taming the Collective Intelligence of Cells for Regenerative Medicine", https://www.youtube.com/watch?v=41b254BcMJM
The paper makes outlandish claims like "GPT-4 has common sense grounding" on the basis of its answers to these questions, but the questions don't show that the model has common sense or grounding. One of their constructed questions involves prompting the model with the equator's exact length—"precisely 24,901 miles"—and then being astonished that the model predicts that you're on the equator ("Equator" being the first result on Wikipedia for the search term "24,901"). It's also the case that while GPT-4 can say a bear at the north pole is "white", it has no way of knowing what "white", or "bear", or "north" actually represent.
Are there folks out there doing rigorous research on these topics, who have a framework for developing tests of actual understanding?
This is a preposterous claim that you could easily disprove within a few minutes of using it.
What does it mean to know what "white", "bear" or "north" actually represent?
I don't know why, but my brain refuses to accept GPT-4 as something close to AGI. Maybe I am wrong. It is hard to believe that our brain is just a bunch of attention layers and neural nets.
But ultimately, our brains are still just made of neurons. As far as we know, there isn’t some sort of extreme molecular computing going on (ie memories directly stored in RNA or whatever) or any large scale quantum mechanics (temperature too high).
The differences between AI approaches like artificial neural networks and our animal meat brains could be just the difference between a propeller and flapping wings. Same base mechanics (airfoil producing lift as thrust), different substantiation.
Our brain isn't, but I'd wager the architectural complexity of a physical, neuronal brain is not optimized for all useful mental tasks, and has perhaps a fair amount of local maxima that are near vestigial in overall positive impact on cognition. Just because the human brain model of cognition is the only way nature has been able to create GI doesn't mean it's the only way GI can be attained.
The best kind of machine is the simplest one needed to produce a desired outcome.
GPT-4 could be AGI, but it feels like cheating to achieve AGI by feeding the entire internet. If someone can build AGI with only the data that humans consume in their lifetime, then that, imho, is the real AGI.
I don't think systems like GPT-4 can ever be truly intelligent, because they simply output randomly generated garbage from a specific distribution. Their output may eventually be indistinguishable from that of a truly intelligent system, but the causal mechanism behind them is not intelligent.
On the other hand, most people lose their ability to think when they are under sufficient pressure (such as fighting for their lives). It's plausible that people are fundamentally no different from systems like GPT-4 in such situations. Then a language model could be a key part of an AGI, but true intelligence would also need higher-level causal mechanisms.
Note: This assumes that "conscious" as defined in this context is specific enough for the question to ever be meaningfully answered "Yes." This is a non-trivial assumption because there are criteria by which some would judge AIs as already conscious. Alternatively, some philosophers of mind have criteria by which they assert humans aren't conscious.
Let's consider a potential future conscious AGI created by advancing from something like ChatGPT.
The human brain is "always on". It's possible to have a digital system be always on, i.e. not just train once and then just respond, but constantly take new input.
The human brain has way more connections/layers than ChatGPT. It's possible to imagine the digital system getting the same number of connections.
The human brain gets real time sensory input. It's possible to add cameras, microphones, etc to that digital system so it gets a constant feed. Maybe even let it process what it saw during the day in a batch training/GC run (we could call that "sleep").
The human brain has a different topology. It's possible to alter the topology of a digital system's neural network to mimic that, instead of using the ChatGPT topology. It's not like we're forever doomed to its simpler statistical model. But it's interesting that it already gets very significant emergent intelligence-like properties.
The human brain is self-conscious. This can very well be an emergent property of the above. I think all that's needed is the ability to have some form feedback mechanism.
https://www.newscientist.com/article/mg25634130-100-roger-pe...
Beliefs are changing so fast right now. The term "AGI skeptic" will soon (if not already) mean "I don't trust AGIs in positions of authority or power" rather than "I don't think the technology is capable of matching our level of cognition."
https://en.wikipedia.org/wiki/Chinese_room
This is also similar to the Duck Test:
https://en.wikipedia.org/wiki/Duck_test
Depending on the context, there are generally two takes: "It is (or is not) a duck", and "It doesn't (or does) matter whether or not it's a duck". These aren't mutually exclusive.
We know it can do a lot of cool stuff, but without a pinned down definition the headline here is useless.
So either AGI is around the corner or a generation away. Same as positive yield fusion reactors?
It's very effective: if you are at least semi-new to the technology you are using, it will explain and teach you things you didn't know before, and if you know the tech by heart, it saves you from having to type it out.
For example, yesterday I had to make a custom container with some pretty involved dependencies that also had to be runnable on AWS Lambda (which I haven't used much before). GPT-4 and I went back and forth with Dockerfile code and error messages for a few hours, and then it ran like a charm. It would probably have taken me 1-2 days of regular coding and googling otherwise.
Yes, I think I can,
Though it might take a clever plan.
I’ll start by noting Euclid’s proof,
Which shows that primes aren’t just aloof.
Assume we have a finite list,
Of primes, and that none have been missed.
Multiply them all together,
And add one, just to be clever.
The result will be a number,
That has no prime factors, oh wonder!
Not true. The result may itself be prime. For example, say your finite list of primes is {2,3}. Multiplying them together and adding one yields 2*3 + 1 = 6 + 1 = 7, which is itself prime. Hence it has a prime factor: 7.

But every number has a prime,
That divides it, it’s just a matter of time.
So we’ve found a contradiction,
And our finite list needs eviction.
There must be infinitely many primes,
And that’s the end of my rhyming lines.
A correct version of the proof might say that the result has a prime factor; however, that prime factor is not an element of the supposed finite list, hence the list is incomplete.

> Assume we have a finite list,
> Of primes, and that none have been missed.
It's assuming that the finite list contains all primes and then noting that you can construct a new number which has no prime factors, which is a contradiction.
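The corrected version of Euclid's argument (the product of the listed primes, plus one, always has a prime factor outside the list) can be checked mechanically. A small sketch:

```python
def smallest_prime_factor(n):
    # Trial division up to sqrt(n); if nothing divides n, n itself is prime.
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

def prime_outside(primes):
    """Euclid's step: some prime factor of (product + 1) is not in `primes`."""
    product = 1
    for p in primes:
        product *= p
    q = smallest_prime_factor(product + 1)
    assert q not in primes  # the heart of the contradiction
    return q

print(prime_outside([2, 3]))                # 7 (the new number is itself prime)
print(prime_outside([2, 3, 5, 7, 11, 13]))  # 59, since 30031 = 59 * 509
```

Both cases the thread discusses show up here: sometimes product + 1 is itself a new prime (7), and sometimes it's composite but its prime factors are still new (59 and 509).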
Joking aside … the approach of dismissing generality of intelligence based on the presence of mistakes seems to be flawed.