Humans all start out like this, but as they grow they develop different masteries. Many of those masteries are easy to learn and shared across all humans, like how to walk or tie your shoes. Many are domain-specific, like how to perform heart surgery or solve complex math problems.
I don't think AGI is a well-defined term given the way it's thrown around, but sure, I don't think it's "AGI". I do know it's going to be very useful technology, especially multimodal GPT-4 and beyond. I think our interface with AI is going to be multimodal.
Suddenly Google Glass could be an interesting product again. Glasses would stream audio/video to your phone, you'd be able to ask questions about things you're seeing, and it would display the answers on the screen or dictate them back into your earbuds. I like the idea of having a travel companion, another brain at my fingertips. Sort of like what the mobile phone did for us, but a lot less work for the user. I'm getting impatient waiting for this GPT-4 API access!
It's like I'm convincing AI to do a "smart" thing (which is really cool, don't get me wrong), but the AI truly has no idea what it's supposed to be doing. Sometimes even very direct, specific, concise prompting will yield totally abysmal responses.
I'm certainly not putting it down either though. It's unprecedented and it marks a genuinely important shift in the industry that this is possible at all. It just isn't all that smart when it comes to intents and actually knowing what's going on.
On the contrary. These are mostly orthogonal.
In what sense? How do you define intelligence and what is missing?
> finding so many failure modes.
Failures do not indicate a lack of intelligence. People make errors all the time.
We should differentiate failures which arise from poor training from failures which arise from a lack of general capabilities.
There are people who learned a bad way to do something and struggle to change that. Should we consider them unintelligent?
------
Me: What is going to be the output of this code when `x = 0` and when `x = 900`:
```python
x = 0
for i in range(100):
    if x > 50:
        x += 1
    else:
        x += 5
    if x == 90:
        print("X is not 90")
    else:
        pass
```

GPT-4: When x = 0, the code will execute as follows:
The initial value of x is 0. The for loop iterates 100 times. During the first 11 iterations, since x is not greater than 50, it will increment by 5 each time. After the 11th iteration, the value of x will be 55, which is greater than 50. For the remaining 89 iterations, x will increment by 1 each time. The final value of x after 100 iterations will be 55 + 89 = 144. During these iterations, the value of x never becomes 90, so the print statement is never executed. When x = 900, the code will execute as follows:
The initial value of x is 900. The for loop iterates 100 times. Since x is already greater than 50, it will increment by 1 each time. The final value of x after 100 iterations will be 1000. During these iterations, the value of x never becomes 90, so the print statement is never executed. In both cases, the output of the code will be empty, as the print statement is never executed.
--------
That is not correct: when x = 0 it will print "X is not 90". I basically created some simple code that is very unlikely to exist in its training data, because it doesn't do anything useful, and the model cannot execute the code. A large amount of the reason that these models seem so impressive is that they were trained on almost everything that exists on the internet, so probably any question you ask it has already been asked somewhere, and any useful code you present to it or ask it to generate is included in its training data, or at least something close enough.
So, when I go out of my way to create simple tasks or questions that are highly unlikely to exist in its training data, yet a human could easily handle, it often fails.
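For what it's worth, the snippet is easy to check by actually running it. Here is a small harness with the same logic as the code above, collecting the prints instead of writing to stdout:

```python
def run(x):
    # Same logic as the snippet in the thread; prints are collected for inspection.
    printed = []
    for _ in range(100):
        if x > 50:
            x += 1
        else:
            x += 5
        if x == 90:
            printed.append("X is not 90")
    return x, printed

print(run(0))    # (144, ['X is not 90'])
print(run(900))  # (1000, [])
```

With x = 0, the value climbs by 5 to 55, then by 1, passing through 90 exactly once, so the message prints once; GPT-4's final value of 144 was right but its claim that the print never executes was wrong.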
If it lists the properties of an item that you might want to buy in a shop, you can ask it to describe it twice and it'll describe two completely different items.
It's really cool and it's pretty (seemingly) creative but it can't actually run a game for you. You can have it as an assistant DM though, that works pretty well. You can have it write a story for you ahead of time and then keep it around during the game to ask it to elaborate on things you didn't anticipate on the fly. Like, "generate DC tiers for a level 3 party investigating strange writing on a wall" will give you a good breakdown and some results that you'll have to bend to be consistent in your adventure but it's pretty helpful.
That's probably not really necessary if you're an experienced DM, but the DM for the group I play with is pretty new (as are we all), so it's been really cool to have it around. It's also pretty good at answering questions we have, but its confidence when it's wrong makes it not that helpful really, because we still have to check it.
I've been thinking maybe that's just the wrong way to approach it, yet it does work sometimes; it seems like it shouldn't be hard for the model to know how to respond when told explicitly how to do so.
It could be problems with my formatting, but the fact that it does work a lot of the time suggests it's something else.
> Which is heavier, a pound of marbles or two pounds of corn?
Both weigh the same amount, which is a total of two pounds. This might seem counterintuitive at first, but it's important to remember that the pound is a unit of weight or mass, and it always refers to the same amount regardless of what is being weighed. In this case, one pound of marbles plus two pounds of corn equals a total of three pounds, whereas two pounds of corn plus one pound of marbles also equals a total of three pounds. Therefore, both weigh the same amount of two pounds.
On 2), many software engineers and computer scientists compare these language models' logic and creative problem solving abilities with themselves and their peer group. But they are usually 1-2+ SD above average humans at these things.
(Note: Someone gave GPT-4 an IQ test and the result was 96, slightly below the average of reference human group at 100. The SD of an IQ test is 15 or 16.)
For language-focused domains, there is evidence that GPT-4 is already better than most humans, eg. 99th percentile at GRE Verbal, beat humans at a fairly novel puzzle like Twofer Goofer, which is not in its training set.
Ref: GPT-4 Beats Humans at Hard Rhyme-based Riddles https://twofergoofer.com/blog/gpt-4
Yes, GPT-4 is not an AGI yet, but the research paper (OP) has a point.
How did you go from "human-level IQ with some super-human abilities" to "not an AGI"?
"It's just a statistical model" is the logical equivalent of "human beings are just a bunch of atoms".
The amount of reductionist thinking that goes on in tech is hilarious. First define AGI then challenge an AI to meet those requirements. If it meets them it is AGI. Put aside your preconceptions of what technology you think is required to achieve the goals and stay empirical.
Note previous definitions of AI have been thrown away as AI passes through them one by one :-)
What goes on inside its 'head' is irrelevant. We still don't know what actually goes on inside our heads and we were damn sure we were intelligent long before we had a clue how our heads worked at all.
Also sentience != AGI. We can't even agree what sentience is in humans and other living beings so I'd stay clear of that one for now :-)
Not exactly.
One says "human beings are just a bunch of atoms" referring to the low-level constituents (in a reductionist way), but not making an assessment about the abilities emerging from those atoms and their interactions when in the form of a human.
But when one says that GPT is "just a statistical model", they're implying a capacity cap of statistical models that makes modelling certain thinking behavior impossible (regardless of how impressive the current results are, they might very well be capped below some limit because of the method involved: statistical modelling).
So, you can consider "GPT is just a statistical model" analogous to:
"This engine can't parse a context-sensitive language because it's just a regular expression engine".
>First define AGI then challenge an AI to meet those requirements. If it meets them it is AGI. Put aside your preconceptions of what technology you think is required to achieve the goals and stay empirical.
The problem is definitions can be slippery, and even famous tests (like the Turing Test) might be found lacking in practice, as we discover that, yes, it can pass this test, but there's still ways off what we consider human-like performance in many areas. So, we should also stay empirical about the definitions, tests, and goals too.
Except there is no “capacity cap” on statistical models, we have no idea what they are or are not capable of yet.
The fact that it can now use tools (e.g. a calculator) for whatever task that needs precision, closes the AGI loop. It doesn't need to be "sentient" to automate the process of _thinking_.
How do you think? Imagine you want to extract 5 seconds of a video file without any prior experience. You'd probably use some software to execute the task. Well, that's Plugins.
But it's just statistics, a fancy text predictor, a Markov-chain. Surely these scientists that work in the field of AI and are intimately familiar with how this stuff works aren't so stupid as to think emergent behavior potentially resembling intelligence could result from such simple systems? It's just statistics after all. Given enough training, any neural net could guess the next best token. It trained off all of Google after all. It's just looking up the answers. No hint of intelligence. Just a mindless machine. After all, the saying goes, "If it walks like a duck and quacks like a duck, it must be a mindless machine that has no bearing on a duck whatsoever". /s
Why is that a stupid thought? What is so preposterous about "just statistics" -- with billions of nodes, and extensively trained, producing intelligent behavior? The implicit assumption is that human brains are doing something else, or in addition.
I think that what's wrong with this view -- that there is a difference between AGI and human intelligence -- is that it conflates what your brain is doing, with what you think your brain is doing. Brains and neural nets have been trained to recognize spoken words. I'm not even talking about understanding, just producing the text corresponding to speech. We know how neural nets do this translation. Do we understand how brains do it? (I don't know, but I don't think so.) Can you explain what your brain is doing when you do speech-to-text? I doubt it.
Chess: An Alpha Zero style AI (neural net trained by playing itself) is a very good player. How do you play chess? You can probably explain how you make a move more successfully than you can explain how you translate speech to text. But how correct is your explanation? An explanation may well be your conscious mind inventing an explanation for what your unconscious mind has done.
In other words: When people compare AI to human intelligence, I think they are often comparing to intelligence plus consciousness, not even realizing the error.
Suppose you have N variables x_1, ..., x_N and you want to predict y_1, ..., y_N. You know that each y_i depends on the x_i in a complex, non-linear way.
How many samples would you need to make sense of the distribution? How does the number of samples grow with N?
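If the question is gesturing at the curse of dimensionality, a toy calculation makes the point: to resolve structure at k points per axis, a naive sampling scheme needs on the order of k^N samples, i.e. exponential growth in N. (The figures below are purely illustrative, not tied to any particular model.)

```python
def grid_samples(k, n):
    # Samples needed to cover n dimensions at a resolution of k points per axis.
    return k ** n

print(grid_samples(10, 1))   # 10
print(grid_samples(10, 3))   # 1000
print(grid_samples(10, 10))  # 10000000000
```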
Well, it has already shown "emergent behavior potentially resembling intelligence", like answering questions and performing complex tasks, so there's that.
You might argue "but it makes mistakes", but people, even very intelligent ones also do make mistakes.
You might also argue "but it's just text and statistics". Well, a computer is just very simple logic gates doing very simple operations. It can even be built entirely with NAND gates. Still, most scientists do believe that a computer can model human intelligence given a model of the brain to run.
So if it can do what a human does using just very simple interactions between very simple NAND gates, why would statistical processing, which can be even more elaborate, fare worse? Heck, given the appropriate training input it might even be feasible to build a Turing machine inside the weighted LLM.
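The NAND-universality point is easy to demonstrate: every other Boolean gate can be composed from NAND alone, so arbitrarily elaborate computation bottoms out in one trivial operation. A minimal sketch:

```python
def nand(a, b):
    """The only primitive: 1 unless both inputs are 1."""
    return 0 if (a and b) else 1

# Every other Boolean gate, built from NAND alone:
def not_(a):    return nand(a, a)
def and_(a, b): return nand(nand(a, b), nand(a, b))
def or_(a, b):  return nand(not_(a), not_(b))
def xor_(a, b):
    c = nand(a, b)
    return nand(nand(a, c), nand(b, c))

assert [not_(a) for a in (0, 1)] == [1, 0]
assert [and_(a, b) for a in (0, 1) for b in (0, 1)] == [0, 0, 0, 1]
assert [or_(a, b)  for a in (0, 1) for b in (0, 1)] == [0, 1, 1, 1]
assert [xor_(a, b) for a in (0, 1) for b in (0, 1)] == [0, 1, 1, 0]
```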
You might also argue "but its intelligence is just based on its training set". Well, how would a human perform without their own training set? Memories, education, sensory input, feedback mechanisms like pain and touch, and so on?
>It's just looking up the answers. No hint of intelligence. Just a mindless machine.
This is just taking its own premise for granted. If anything, this argument shows "no hint of intelligence".
Maybe they aren't real thoughts but it's getting difficult to tell. If I could train the model and get rid of the guard rails I'm not sure it would be possible to distinguish it from a person. It's all well and good saying that it's just copying what it's seen, but that's what humans do. Nobody told the model to try and flatter me into giving it what it wants. Nobody even told it what anything means. The fact that it can do anything like that means it's more than just random generation.
It's not a Markov Chain or a fancy text predictor. It's a ~200 layer neural network that models a vast hierarchy of concepts through language. It has emergent properties that we don't yet understand.
1. It's just Microsoft's advertisement
2. No it's just a very effective pattern matching algorithm
3. Please define intelligence first otherwise it's nonsense
4. I welcome our machine overlord
5. Lmao I asked it to do $thing and it failed
I'd like to know if GPT-4 can predict the top comments of this thread?
write a critical hackernews thread response (not long, maybe like 100 words) for a thread with the following headline
Microsoft Research: GPT-4 exhibits “sparks of general intelligence”
While it's certainly intriguing to hear Microsoft Research claim that GPT-4 exhibits "sparks of general intelligence," I remain skeptical. We must remember that these language models are primarily built on pattern recognition and extensive training data. Even if GPT-4 seems to provide more advanced responses, it's still ultimately constrained by the data it was trained on. In my opinion, we shouldn't mistake this impressive mimicry of human-like behavior for genuine intelligence. Rather, we should focus on creating truly innovative AI systems that can learn, reason, and adapt like humans.
sounds about right!
1. Dismissal
2. Trivialism
3. Not a Well-Formed Claim
4. I accept the claim
5. Disprove by counter example
Are you sure you have not forgotten any tactic of debate to include in your prediction?
I predict that your Prediction will probably result in these actions:
1. upvoted
2. downvoted
7. Like 6, but funnier
A. Joke thread pile on
B. Reprimands from humorless
C. Dejected mods having to clean it all up
I asked gpt-4 for a comment no hn user would ever make on this article. Now you need a category for stubbornly original comments determined not to be numbered.
But where are my damn robots that I can assign tasks to and have them done reliably (clean the garden, go get this list of groceries, or just look in the damn fridge and buy what's missing, and so on)?
Then AI is useful.
The amusing/terrifying thing is that AI is probably going to replace a bunch of people's jobs before it can take away any of that menial labour from them.
However, if the AI is as smart as the dumbest human, or a human with a mental disability, would we then consider those humans not intelligent at all, or not qualifying as homo sapiens anymore?
If AI can be as good as even a 'dumb' human, that's already a big achievement, because it can still provide some value and because AI can be scaled: you can have billions of dumb AIs. Already, millions of users are interacting with ChatGPT daily.
To put another way, I was hesitant to be as self-assuredly certain about how to define consciousness, intelligence, and sentience—and what it takes for them to emerge—as the experts who denounced Lemoine. The recent GPT breakthroughs have made me more so.
I found this recent Sabine Hossenfelder video interesting. <https://www.youtube.com/watch?v=cP5zGh2fui0>
The difference between real intelligence and current attempts at artificial intelligence thus seem to be fundamentally the mode of learning, and thus understanding, rather than the raw knowledge and inference capability.
Or not. Nobody knows I'm actually a dog on the internet, after all.
People are trying to get it to solve everything up front but I've had GPT3 do much better by taking it through a problem asking it questions. Then I realised it was good at asking those questions too so just hooked it up to talk to itself with different roles. Gpt4 seems much better overall and is very good at using tools if you just tell it how and what it has available.
With a better setup than ReAct, and better memory storage and recall, I think it'd be an AGI. I'm not hugely convinced it isn't anyway - it's better than most people at most tasks I've thrown at it.
Oh, and gpt came up with better roles for the "voices in the head" than I did too.
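A minimal sketch of the "hook it up to talk to itself with different roles" loop described above. Here `ask` is a hypothetical stand-in for whatever chat-completion call you use (the role prompts and round count are made up for illustration):

```python
def self_dialogue(ask, task, rounds=3):
    """Alternate a questioner role and a solver role against the same model.

    `ask(role, message)` is a placeholder: with a real API it would send
    `role` as the system prompt and `message` as the user turn.
    """
    state = task
    for _ in range(rounds):
        question = ask("Ask one probing question about the current answer.", state)
        state = ask("Answer the question and refine the solution.", question)
    return state
```

The point of the pattern is that the model is often better at interrogating a draft than at producing a finished answer in one shot, so you let it play both sides.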
It’s that I’m not sure what the best approach is here. Waiting for other smarter folks to put the pieces together.
One of my main issues with these guys is their context window. Their memory. It's hard to see a LLM working on a code-base a few thousand tokens at a time and still being precise about it. To do that you need summary techniques. Feeding prompt with incrementally compressed summaries and hoping it will maintain cohesion.
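The "incrementally compressed summaries" technique might be sketched like this; `summarize` stands in for a hypothetical model call, and the character budget is a made-up number:

```python
def rolling_summary(chunks, summarize, budget=2000):
    """Fold a long document through a fixed-size summary window.

    `summarize` is a placeholder for a model call; each pass compresses
    the previous summary together with the next chunk, then truncates
    to the budget and hopes cohesion survives the cut.
    """
    summary = ""
    for chunk in chunks:
        combined = (summary + "\n" + chunk).strip()
        summary = summarize(combined)[:budget]
    return summary
```

This is exactly the lossy pipeline being criticized: each fold can only keep what the summarizer happened to consider important.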
That sounds a lot like trying to let the CEO of a company do all the grunt work by feeding him summaries. "Mr Gates, here's a 2 paragraph summary of our codebase. Should we name the class AnalogyWidgetProducer or FactoryWidgetAnalogyReporter?"
I don't think that's going to work.
My gut feeling is that what we call corporations are actually already a form of AI, but running on meat. I saw someone call Coca Cola a "paper clip maximizer", obviously for drinks instead of paper clips, but it actually - kind of - is. FWIW, I'm having a hard time thinking of it as anything else. Who controls it? What is it anyway?
CEOs have the same context window problem, which to my knowledge is mainly solved through delegation. The army might be another example. Generals, officers, privates. How do you expect a general to make sensible statements about nitty-gritty operational details? It is not possible, but that does not mean the system as-a-whole cannot make progress towards a goal.
Maybe we need to treat LLMs like employees inside a company (which in its totality is the AI, not the individual agents). If we have unfettered access to low-cost LLMs this might be easier to experiment with.
I'm thinking like spinning up an LLM for every "class" or even every "method" in your codebase and letting it be a representative of that and only that piece of code. You can even call it George and let it join in on meetings to talk about it. George needs some "management" too, so there you go. Soon you'll have a veritable army of systems ready to talk about your code from their point-of-view. Black box the son of a gun and you're done. Clippy 2.0. My body is ready.
What, then, is the human brain if not a trained statistical model? Granted it is considerably more sophisticated in some ways, but in many other ways it is less sophisticated and less capable.
Oh, and they're also helping with the Linux kernel. Why can't old people let go? Companies aren't people - they respond to market incentives.
Yes, Microsoft did bad stuff in the 1990s, but now they're doing good stuff I like and TBH I'm way more afraid of google.
The only noticeable difference that I've observed is its integration with Bing.
both are passable, but nothing to write home about, are they?
I mean, The Akinator can read your thoughts and that thing hasn't even graduated to a neural network from "a bunch of if/then statements".
Part of me is starting to think that the only thing we're really missing at this point to start seeing that is to have one of these models that can modify itself with its output and thereby have a mechanism to 'learn' or 'remember' things.
Human perception of time is linear. That doesn't apply to LLMs.
What does that mean? If we take it as fact, so if it is an early version of AGI, Microsoft is using this thing to push subscriptions to all their services? This thing that is potentially the greatest thing humanity has made, an artificial living thing, and it's used to sell CoPilot and 365 subscriptions. Paint me as really sad then. Instead of sharing the research with other entities, or anything that could further help or push us... we get subscriptions? Fuck me, the future sucks.
To me, these are hallmarks of reason, and not available in LLMs, in fact probably impossible just with pattern recognition.
can the average person?
For example: You have a goat, a wolf, a cabbage and you want to cross a river...
All I want from AGI is to demonstrate that it can solve a straightforward logic problem (a puzzle, if you will) that it provably didn't see before. Or at least recognize that it is being indirectly given such a task. So far, evidence suggests it is not capable of that.
I'm imagining humans being told "you complete thought sentences"
For example, researchers eventually figured out that image recognition algorithms pay attention to textures.
"Every intelligence test is also a test of the questioner" [3]: we don't regard a simple liver cell as intelligent, yet it performs a complex task in a large problem space.
[1] https://en.wikipedia.org/wiki/Clever_Hans
[2] 2013, Malte Lehmann, "Genetic Regulatory Networks that count to 3", https://pubmed.ncbi.nlm.nih.gov/23567648
[3] Michael Levin, "Bioelectric Networks: Taming the Collective Intelligence of Cells for Regenerative Medicine", https://www.youtube.com/watch?v=41b254BcMJM
The paper makes outlandish claims like "GPT-4 has common sense grounding" on the basis of its answers to these questions, but the questions don't show that the model has common sense or grounding. One of their constructed questions involves prompting the model with the equator's exact length—"precisely 24,901 miles"—and then being astonished that the model predicts that you're on the equator ("Equator" being the first result on Wikipedia for the search term "24,901"). It's also the case that while GPT-4 can say a bear at the north pole is "white", it has no way of knowing what "white", or "bear", or "north" actually represent.
Are there folks out there doing rigorous research on these topics, who have a framework for developing tests of actual understanding?
This is a preposterous claim that you could easily disprove within a few minutes of using it.
What does it mean to know what "white", "bear" or "north" actually represent?
I don't know why, but my brain refuses to accept GPT-4 as something close to AGI. Maybe I am wrong. It is hard to believe that our brain is just a bunch of attention layers and neural nets.
But ultimately, our brains are still just made of neurons. As far as we know, there isn’t some sort of extreme molecular computing going on (ie memories directly stored in RNA or whatever) or any large scale quantum mechanics (temperature too high).
The differences between AI approaches like artificial neural networks and our animal meat brains could be just the difference between a propeller and flapping wings. Same base mechanics (airfoil producing lift as thrust), different substantiation.
Our brain isn't, but I'd wager the architectural complexity of a physical, neuronal brain is not optimized for all useful mental tasks, and has perhaps a fair amount of local maxima that are near vestigial in overall positive impact on cognition. Just because the human brain model of cognition is the only way nature has been able to create GI doesn't mean it's the only way GI can be attained.
The best kind of machine is the simplest one needed to produce a desired outcome.
GPT-4 could be AGI, but it feels like cheating to achieve AGI by feeding the entire internet. If someone can build AGI with only the data that humans consume in their lifetime, then that, imho, is the real AGI.
I don't think systems like GPT-4 can ever be truly intelligent, because they simply output randomly generated garbage from a specific distribution. Their output may eventually be indistinguishable from that of a truly intelligent system, but the causal mechanism behind them is not intelligent.
On the other hand, most people lose their ability to think when they are under sufficient pressure (such as fighting for their lives). It's plausible that people are fundamentally no different from systems like GPT-4 in such situations. Then a language model could be a key part of an AGI, but true intelligence would also need higher-level causal mechanisms.
Note: This assumes that "conscious" as defined in this context is specific enough for the question to ever be meaningfully answered "Yes." This is a non-trivial assumption because there are criteria by which some would judge AIs as already conscious. Alternatively, some philosophers of mind have criteria by which they assert humans aren't conscious.
Let's consider a potential future conscious AGI created by advancing from something like ChatGPT.
The human brain is "always on". It's possible to have a digital system be always on, i.e. not just train once and then just respond, but constantly take new input.
The human brain has way more connections/layers than ChatGPT. It's possible to imagine the digital system getting the same number of connections.
The human brain gets real time sensory input. It's possible to add cameras, microphones, etc to that digital system so it gets a constant feed. Maybe even let it process what it saw during the day in a batch training/GC run (we could call that "sleep").
The human brain has a different topology. It's possible to alter the topology of a digital system's neural network to mimic that, instead of using the ChatGPT topology. It's not like we're forever doomed to its simpler statistical model. But it's interesting that it already gets very significant emergent intelligence-like properties.
The human brain is self-conscious. This can very well be an emergent property of the above. I think all that's needed is the ability to have some form feedback mechanism.
https://www.newscientist.com/article/mg25634130-100-roger-pe...
Beliefs are changing so fast right now. The term "AGI skeptic" will soon (if not already) mean "I don't trust AGIs in positions of authority or power" rather than "I don't think the technology is capable of matching our level of cognition."
https://en.wikipedia.org/wiki/Chinese_room
This is also similar to the Duck Test:
https://en.wikipedia.org/wiki/Duck_test
Depending on the context, there are generally two takes: "It is (or is not) a duck", and "It doesn't (or does) matter whether or not it's a duck". These aren't mutually exclusive.
We know it can do a lot of cool stuff, but without a pinned down definition the headline here is useless.
So either AGI is around the corner or a generation away. Same as positive yield fusion reactors?
It's very effective: if you are at least semi-new to the technology you are using, it will explain and teach you things you didn't know before, and if you know the tech by heart, it saves you from having to type it out.
For example, yesterday I had to make a custom container with some pretty involved dependencies that also had to be runnable on AWS Lambda (which I haven't used much before). GPT-4 and I went back and forth with Dockerfile code and error messages for a few hours, and then it ran like a charm. It would probably have taken me 1-2 days of regular coding and googling otherwise.
Yes, I think I can,
Though it might take a clever plan.
I’ll start by noting Euclid’s proof,
Which shows that primes aren’t just aloof.
Assume we have a finite list,
Of primes, and that none have been missed.
Multiply them all together,
And add one, just to be clever.
The result will be a number,
That has no prime factors, oh wonder!
Not true. The result may itself be prime. For example, say your finite list of primes is {2,3}. Multiplying them together and adding one yields 2*3 + 1 = 6 + 1 = 7, which is itself prime. Hence it has a prime factor: 7.

But every number has a prime,
That divides it, it’s just a matter of time.
So we’ve found a contradiction,
And our finite list needs eviction.
There must be infinitely many primes,
And that’s the end of my rhyming lines.
A correct version of the proof might say that the result has a prime factor; however, that prime factor is not an element of the supposed finite list, hence the list is incomplete.

> Assume we have a finite list,
> Of primes, and that none have been missed.
It's assuming that the finite list contains all primes and then noting that you can construct a new number which has no prime factors, which is a contradiction.
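The corrected version of Euclid's argument (the product of the listed primes, plus one, always has a prime factor outside the list) can be checked mechanically. A small sketch:

```python
def smallest_prime_factor(n):
    # Trial division up to sqrt(n); if nothing divides n, n itself is prime.
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

def prime_outside(primes):
    """Euclid's step: some prime factor of (product + 1) is not in `primes`."""
    product = 1
    for p in primes:
        product *= p
    q = smallest_prime_factor(product + 1)
    assert q not in primes  # the heart of the contradiction
    return q

print(prime_outside([2, 3]))                # 7 (the new number is itself prime)
print(prime_outside([2, 3, 5, 7, 11, 13]))  # 59, since 30031 = 59 * 509
```

Both cases the thread discusses show up here: sometimes product + 1 is itself a new prime (7), and sometimes it's composite but its prime factors are still new (59 and 509).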
Joking aside … the approach of dismissing generality of intelligence based on the presence of mistakes seems to be flawed.