Simply explained: How does GPT work?

(confusedbit.dev)

410 points · nitnelave · 3y ago · 382 comments

215 comments · 26 top-level

I’d be interested in hearing from anyone who takes the Chinese Room scenario seriously, or at least can see how it applies to any of this.

I cannot see that it matters if a computer understands something. If it quacks like a duck and walks like a duck, and your only need is for it to quack and walk like a duck, then it doesn’t matter if it’s actually a duck or not for all intents and purposes.

It only matters if you probe beyond the realm at which you previously decided it matters (e.g. roasting and eating it), at which point you are also insisting that it walk, quack and TASTE like a duck. So then you quantify that, change the goalposts, and assess every prospective duck against that.

And if one comes along that matches all of those but doesn’t have wings, then if you deny it to be a duck FOR ALL INTENTS AND PURPOSES it simply means you didn’t specify your requirements.

I’m no philosopher, but if your argument hinges on moving goalposts until purity is reached, and your basic assumption is that the requirements for purity are infinite, then it’s not a very useful argument.

It seems to me to posit that to understand requires that the understandee is human. If that’s the case we just pick another word for it and move on with our lives.

mhink · 3y ago

In my understanding of the Chinese Room example, the resolution to the argument is that the *human* may not understand Chinese, but the *system as a whole* can be said to understand it.

With this in mind, I think asking whether ChatGPT *in and of itself* is "conscious" or has "agency" is sort of like asking if the speech center of a particular human's brain is "conscious" or has "agency": it's not really a question that makes sense, because the speech center of a brain is just one part of a densely interconnected system that we only interpret as a "mind" when considered in its totality.

pegasus · 3y ago

Good point, that very much vibes with my thoughts on this matter. Lately, I've been contemplating the analogy between the role LLMs might take within society with that of the brain's language center* in human behavior. There's definitely a way in which we resemble these models. More than some might like to admit. The cleverness, but also the hallucinating, gaslighting and other such behaviors.

And on the other hand, any way you'd slice it, it seems to me LLMs - and software systems in general - necessarily lack intrinsic motivation. By definition, any goal it has can only be the goal of whoever designed that system. Even if its maker decides - "let it pick goals randomly", those randomly picked goals are just intermediate steps toward the enacting of the programmer's original goal. Robert Miles' YouTube videos on alignment shed light on these issues also. For example: https://www.youtube.com/watch?v=hEUO6pjwFOo

Another relevant source on these issues is the book "The Master and his Emissary", which discusses how basically the language center can, in some way - I'm simplifying a lot, fall prey to the illusion that "it" is the entirety of human consciousness.

* or at least some subsystems of that language center, it's important to remember how little we still understand of human cognition

brycedriesenga · 3y ago

What goals do we have that aren't essentially all boiled down to whatever evolution, genetics, and our environment have sort of molded into us?

HDThoreaun · 3y ago

I don't think that's the correct take for the room. Say the human speaks English. If you asked them what the conversation was about, and they had the full resources of the room at their disposal, could they tell you? No, because the room doesn't actually allow them to understand Chinese; it's just a symbol lookup table. The lookup table doesn't mean the system understands Chinese, just the relationship between symbols that can lead to a coherent output.

addcommitpush · 3y ago

What if the human learns all the rules? Then the system as a whole is the human.

kmod · 3y ago

Here's an example that I think garners more agreement that properties of a limit ("really understanding") don't necessarily mean that any path towards that limit has the properties of the limit. I think there's a lot of room for disagreement about whether this is a factually-accurate analogy and I'm not trying to argue either way on that, just trying to answer your question about how one might make these sorts of arguments if one has a certain belief about the facts.

Let's say we're trying to build a calculator that only needs to do integer addition. And we decide to build it by building a giant if-else chain that hardcodes the answer to each and every possible addition. And due to finite resources, we're going to hardcode all the additions of integers up to absolute value N, but we will increase N over time.

Everything you said applies equally to this situation: it quacks like a duck, and when we talk about things it can't do we have to continually move the goalposts each time a new version comes out. It also has the property that there is a "scaling law" that says that each time you double N you get predictably better performance from the system, and you can do this without bound and continually approach a limit where it can answer any question indistinguishably from something we might call "true understanding".

But I think it's a bit easier to agree that in this case it's not "really doing" addition and falls a bit short of our wish to have an artificial addition system. And if someone touts this system as the way to automate addition we might feel a bit irritated.

Again, many people will say that this is a bad analogy because LLMs operate quite differently, and I'm not trying to argue for or against that. Just trying to give my explanation for how a certain understanding of the facts can imply the kind of conclusion that you are trying to understand.
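A minimal sketch of the hardcoded calculator described above, assuming a dictionary stands in for the giant if-else chain. The names `build_table` and `table_add` are hypothetical, chosen only for illustration:

```python
from itertools import product
from typing import Dict, Optional, Tuple


def build_table(n: int) -> Dict[Tuple[int, int], int]:
    """Hardcode a + b for every pair of integers with absolute value <= n."""
    values = range(-n, n + 1)
    return {(a, b): a + b for a, b in product(values, repeat=2)}


def table_add(table: Dict[Tuple[int, int], int], a: int, b: int) -> Optional[int]:
    """Look the answer up; return None when the pair was never hardcoded."""
    return table.get((a, b))


small = build_table(10)
large = build_table(20)

print(table_add(small, 3, 4))    # pair is inside the table
print(table_add(small, 11, 1))   # pair is outside the table, so no answer
print(table_add(large, 11, 1))   # a larger N covers it
print(len(small), len(large))    # entry count grows as (2N + 1) ** 2
```

Each new version answers more questions, but the table never contains a rule for addition itself: any pair beyond the current N simply has no entry.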

lordnacho · 3y ago

I think what this points towards is that we care about the internal mechanism. If we prod it externally and it gives the wrong answer, then the internal mechanism is definitely wrong. But if we get the right answers and then open it up and find the internals are still wrong, it's still wrong.

This illuminates a contradiction: the walks like a duck thing is incompatible with the internals being a duck. If you see a creature with feathers that waddles and can fly, it might still be a robot when you open it. So your test cannot just rely on external tests. But you also want to create a definition of artificial intelligence that doesn't depend on being made of meat and electricity.

simonh · 3y ago

> I think what this points towards is that we care about the internal mechanism.

The mechanism is what makes a system interesting.

In software this is why we develop libraries of algorithms and code we can reuse and compose into new solutions. The programmer is providing the intellectual flexibility, while the code is the set of capabilities. It’s why this is a superior approach, compared to building a single monolithic mass of procedural code from scratch in a single variable scope for every program we write.

Solutions matter because it’s not just about what a system can do now, it’s about what it can learn or be adapted to do next.

pegasus · 3y ago

I think @dvt's comment above is a good attempt at answering this question. I agree with him that intrinsic motivation and a capacity for suffering, hope and all the other emotions (which we share with pretty much all animals, if not plants too) are at the top of the list. Cleverness is there also, but not at the top of the list.

brotchie · 3y ago

re: Calculator examples, I kinda see them both as information systems that achieve a result but there's a phase difference in where the information is stored.

Similar to how spacetime is 4D, such that with relativity going faster in a spatial dimension kind of "borrows" from the time dimension (in a hand wavy way).

By analogy, you can have something that's purely a lookup table, or on the other hand, completely based on an algorithm, and the full lookup table is kind of "borrowing" from the algorithmic dimension of the information system space and, vice versa, the fully algorithmic version is borrowing from the hardcoded dimension of the information system space.

Under the condition that you're adding integers below N, if you consider BOTH dimensions (hardcoded, algorithmic) as a single space (as with 4D spacetime), then they are equivalent.

Need to work on this theory further to make it more understandable, but I think this way about intelligence.

Intelligence sits as a pattern in the information system space that can range anywhere from hardcoded to algorithmic (if we choose to orthogonalize the space this way). But what actually matters is the system's future impact on its local laws of physics, and for that purpose both implementations are equivalent.

Edit: Conversation with GPT-4 about this https://sharegpt.com/c/Sbs4XgI

continuational · 3y ago

The only thing that separates your mechanism for doing addition from what computers actually do is efficiency. Computers can only add numbers up to some fixed size, e.g. 64 bits, and you have to use repetition to add anything larger.

Does that mean computers are not "really doing" addition?
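To make the "repetition" concrete, here is a rough sketch of adding numbers wider than one machine word by chaining fixed-size additions with a carry. The 64-bit limb size follows the example above; the helper names (`to_limbs`, `from_limbs`, `add_limbs`) are made up for illustration, and Python's own big integers are used only to check the result:

```python
from typing import List

LIMB_BITS = 64
LIMB_MASK = (1 << LIMB_BITS) - 1


def to_limbs(n: int) -> List[int]:
    """Split a non-negative integer into 64-bit limbs, least significant first."""
    limbs = []
    while True:
        limbs.append(n & LIMB_MASK)
        n >>= LIMB_BITS
        if n == 0:
            return limbs


def from_limbs(limbs: List[int]) -> int:
    """Reassemble limbs (least significant first) into one integer."""
    total = 0
    for i, limb in enumerate(limbs):
        total |= limb << (i * LIMB_BITS)
    return total


def add_limbs(a: List[int], b: List[int]) -> List[int]:
    """Add two limb lists using only word-sized additions plus a carry bit."""
    result = []
    carry = 0
    for i in range(max(len(a), len(b))):
        x = a[i] if i < len(a) else 0
        y = b[i] if i < len(b) else 0
        s = x + y + carry              # fits in one word plus a single carry bit
        result.append(s & LIMB_MASK)   # low 64 bits stay in this limb
        carry = s >> LIMB_BITS         # overflow moves to the next limb
    if carry:
        result.append(carry)
    return result


x = 2**64 - 1
y = 1
print(from_limbs(add_limbs(to_limbs(x), to_limbs(y))) == x + y)
```

The same loop handles operands of any length, limited only by memory, which is the sense in which the fixed word size is an efficiency detail rather than a cap on what can be added.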

simonh · 3y ago

There’s a lot more different than efficiency. We can program computers with algorithms capable of computing any possible addition, the limitation being only the memory of the computer and time, not the algorithm itself. Those algorithms are genuinely doing addition in a way that a pre-computed lookup table is not. It’s the difference between computing an addition in your head and just remembering that 2 + 2 is 4.

iliane5 · 3y ago

What I find really entertaining is the "just predicting the next token" argument.

If just predicting the next token can produce similar or better results than the almighty human intelligence on some tasks, then maybe there's a bit of hubris in how smart we think we actually are.

tines · 3y ago

> If just predicting the next token can produce similar or better results than the almighty human intelligence on some tasks

But it's not better than almighty human intelligence, it _is_ human intelligence, because it was trained on a mass of some of the best human intelligence in all recorded history (I say this because the good stuff like Aristotle got preserved while the garbage disappeared (this was true until the recent internet age, in which garbage survives as well as the gold)).

> then maybe there's a bit of hubris in how smart we think we actually are

I feel like you could say this if ChatGPT or whatever obtained its knowledge some other way than direct guidance from humans, but since we hand-fed it the answers, it falls a little flat for me.

I'm open to persuasion.

autokad · 3y ago

ChatGPT doesn't just feed us back answers we already taught it. It learned relationships and semantics so it can apply that knowledge to do something novel. For instance, I took the basics of a dream and told it to turn it into a short story. The short story wasn't bad. I said make it more exciting, and it updated the story such that one of the cars exploded. I guess ChatGPT learned excitement from Michael Bay.

iliane5 · 3y ago

> But it's not better than almighty human intelligence, it _is_ human intelligence, because it was trained on a mass of some of the best human intelligence in all recorded history

Sure, I was saying "better" in the sense that if for X task, it can do better than Y% of humans.

> since we hand-fed it the answers, it falls a little flat for me

We didn't really hand-feed it any answers though, did we? If you put a human in a white box all their life, with access to the entire dataset on a screen but no social interaction, nothing to see aside from the text, nothing to hear, nothing to feel, nothing to taste, etc., I'd be very impressed if they were then able to create answers that seem to display such thoughtful and complex understanding of the world.

simonh · 3y ago

I think it’s undeniable that LLMs encode knowledge, but the way they do so and what their answers imply, compared to what the same answer from a human would imply, are completely different.

For example, if a human explains the process for solving a mathematical problem, we know that person knows how to solve that problem. That’s not necessarily true of an LLM. They can give such explanations because they have been trained on many texts explaining those procedures, therefore they can generate texts of that form. However, texts containing an actual mathematical problem and the workings for solving it are a completely different class of text for an LLM. The probabilistic token weightings for the maths text explanation don’t help at all. So yes, these are fascinating, knowledgeable and even in some ways very intelligent systems. However, it is a radically different form of intelligence from ours, in ways we find difficult to reason about.

iliane5 · 3y ago

Well it's like birds and airplanes. Do airplanes "fly" in the same sense that birds do? Of course not, birds flap their wings and airplanes need to be built, fueled and flown by humans. You could argue that the way birds fly is "more natural" or superior in some ways but I've yet to see a bird fly Mach 3.

If you replace the analogy with humans and LLMs, LLMs won't ever reason or understand things in the same way we do, but if/when their output gets much smarter than us across the board, will it really matter?

int_19h · 3y ago

Thing is, they can still solve the problem, even if the problem was not one from their training set.

And, more importantly, they solve the problem much better if you tell them to reason about it in writing first before giving the final answer.

goldfeld · 3y ago

[0] If we get a bit quantum (or God for some), then backtracking could happen by collapsing the dead-ends and "changing" history to stay with what turns out to be the solid plan. Could emergent consciousness on AI's neurons do the planning and reasoning that it rather seems to be doing but ML experts will say it is not? If our consciousness could by any chance reside not in the electrical currents of the wetware, could AI's reason also not reside in tokens? Is there some mysterious process possibly taking place and will philosophy probe it?

0: pasted from another thread

majormajor · 3y ago

We've known for a long time that computers can do calculations far, far, far faster than us.

We continue to figure out new ways to make those calculations do more complicated things faster than humans.

What is intelligence beyond calculation is an ancient question, but not the one I'm most interested in at the moment, re: today's tools.

I'm curious right now about whether there's meaning to other people in human creation vs automated creation. E.g. is there a meaningful difference between an algorithm curating a feed of human-made TikTok videos and an algorithm both curating and creating a feed of TikTok videos?

Both qualitatively in terms of "would people engage with it to the same level" and quantitatively in terms of "how many new trends would emerge, how would they vary, how does that machine ecosystem of content generation behave compared to a human one" if you remove any human curation/training/feedback/nudging/etc from the flow beyond just "how many views/likes did you get?"

iliane5 · 3y ago

I think as soon as text2video gets really good (like midjourney level), there’s gonna be so much AI generated content that unless it’s all extremely good, human made content will be something people search specifically for.

As for curation, I think the success of TikTok proves that you don’t need that much data to pretty precisely pinpoint what someone wants to watch (or what will get them to spend the most time on the app at least).

opportune · 3y ago

I don’t think there’s anything making it impossible for actual intelligence to arise from a task as simple as “predicting the next token (to model human thought/speech/writing)” because with enough compute resources, smart AI implementations, and training that task basically would be optimized by becoming a general intelligence.

But it’s clear based on current implementations that once you work backwards from the knowledge that it’s “just predicting the next token” you can easily find situations in which the AI doesn’t demonstrate general intelligence. This is most obvious when it comes to math, but it’s also apparent in hallucinations and the model not being able to reason through/synthesize ideas very well, deviate from the script (instead of just answering a question with what it has already, in some cases it should not even try to answer and instead ask more clarifying questions). To be fair, there are plenty of humans with excellent writing or speaking skills that are bad at that kind of stuff too.

feoren · 3y ago

That'd be like saying that search engines are smarter than the almighty human intelligence because they know the capitals of every country while most humans don't. No, it just has access to a lot of data near-instantaneously. Just like GPT-4 does. It's the enormity of compiled human knowledge that is "smart" in GPT-4. It absolutely is "just predicting the next token", and it turns out that's enough to be an astoundingly intelligent-seeming system when trained on thousands of years of human knowledge. Of course it is! It's like in Avatar: The Last Airbender when he consults with his thousand past-lives at once for wisdom. GPT-4 lets us consult with the collective knowledge of humanity! It's absolutely amazing! And it's also "just predicting the next token". Those are both true.

pegasus · 3y ago

There's definitely hubris in how clever we consider ourselves. And encountering these AIs will hopefully bring a healthy adjustment there. But another manifestation of our hubris is the way we over-valorize our cleverness, making us feel oh so superior to other species, for example. Emotions, desires, agency, which we share with our animal cousins (and plants maybe also), but which software systems lack, are equally important to our life experience.

bnralt · 3y ago

> I’m no philosopher, but if your argument hinges on moving goalposts until purity is reached, and your basic assumption is that the requirements for purity are infinite, then it’s not a very useful argument.

I imagine many definitions are initially rather broad and only get refined down over time. Laertius gives us a classic example:

> Plato defined man thus: “Man is a two-footed, featherless animal,” and was much praised for the definition; so Diogenes plucked a cock and brought it into his school, and said, “This is Plato’s man.” On which account this addition was made to the definition, “With broad flat nails.”

I don’t think it’s correct to think of that as infinitely moving goalposts, however. More that the weakness of definitions isn’t always immediately transparent.

eternalban · 3y ago

> I cannot see that it matters if a computer understands something.

I am not sure they can, but the difference is profound and material. A machine that actually understands, like a human being, is not going to be (cannot be) entirely truthful or transparent. There will be private inner thoughts, idea formation, and possibly even willful intent, as a direct consequence of understanding. And the nature of interactions, regardless of superficial similarity, shifts from one of utility to relationship. For example, we would care to know if e.g. the systems entrusted with apocalyptic deterrent forces are mechanisms or organisms.

Please note that not a single one of us has ever interacted with any intelligent life form lacking a sense of self, or an ego. Thus, all our sensory registers of another 'intelligent being' are learned in a context of the implicit 'this other is like me'. We are not equipped to distinguish or articulate intelligence (in the abstract) merely based on sensory information. Note that even non-verbal communication, such as jabbing a friend in the ribs, is learned to have a certain meaning in that very same context of implicits, and any mechanism that mimics it (via training) will be afforded the same projection of the implicit. I do not believe there is, in fact, any non-destructive test for determining 'consciousness' in an entity. (Destructive, since there may be long-running tests of a subject that can be shown to be probably accurate, possibly via creating situational problems involving survival, and unexpected circumstances.)

Ask yourself what is it that convinces you that the last person you spoke with (in real life) was actually conscious? I assert that the entire matter is a 'fictional certainty' based on assumption of shared nature. "They are conscious because I am".

tim333 · 3y ago

I'm not sure many people take the Chinese Room argument seriously these days. I mean it basically proposes a room that is equivalent to a lookup table and says it can't understand much hence computers in general can't understand. The "hence..." bit makes no sense really.

dvt · 3y ago

> It seems to me to posit that to understand requires that the understandee is human.

Here's a thought experiment. Suppose we make first contact tomorrow, and we meet some intelligent aliens. What are some questions you would ask them? How would you decide on their sentience or understanding?

Sentience involves goal-seeking, understanding, sensory inputs, first-personal mental states (things like pain, happiness, sadness, depression, love, etc.), a sense of what philosophers like Elizabeth Anscombe call I-ness, etc. Most of this stuff, to me, seems language-agnostic. Even a baby that can't speak feels pain or happiness. Even a dog feels anxiety or affection.

LLMs are a cute parlor trick, but a phantasm nonetheless.

iliane5 · 3y ago

There's no denying LLMs are anything but sentient; however, is sentience really needed for intelligence? I feel like if we can have machines that are X% smarter than a human could ever get for any given task, it'd be a much better outcome for us if they were not sentient.

lordnacho · 3y ago

Does that mean sentient beings need to have had the evolutionary history that people have had? After all why would aliens have feelings like we have feelings? Don't we have feelings because either it was useful to our ancestors, or they are a spandrel of something that was?

root_axis · 3y ago

> if it quacks like a duck and walks like a duck, and your only need is for it to quack and walk like a duck, then it doesn’t matter if it’s actually a duck or not for all intents and purposes.

This isn't true. If a plane flies like a bird and you only need it for flying it doesn't then follow that a plane is a bird "for all intents and purposes".

i-use-nixos-btw · 3y ago

The emphasis on “for all intents and purposes” is important. If you specify requirements you specify them for your intent and for your purpose.

Requiring that something fly and that something be a plane are two different things with only minor overlap. If all you require is something that flies, then a dragonfly matches your requirements exactly as much as an apache helicopter does.

barrysteve · 3y ago

I have tried multiple times to use ChatGPT to generate Unreal C++ code. It does not do.

It spits out class names for Slate objects that inherit from other Slate objects. ChatGPT doesn't understand inheritance. It just guesses what might fit inside a parameter grouping, and never suggests something with the right class type.

For my use case, it has never quacked like a duck, so to speak. It never performed, the word that might cover the concept of generating output without understanding it.

We agree on the value of computers understanding versus performing... only as much as you need understanding to make it perform.

Predicting words alone does not cut the mustard; some structural depth or validating maps or some new concept is needed to shore up the wild horsepower in ChatGPT.

It must understand/have structure, or at least use a crutch to get it over the finish line.

i-use-nixos-btw · 3y ago

Yes, in this instance I understand the failings of today (though Copilot has a much better hit rate, and at the moment it’s a great augmentation to coding if you treat it like an enthusiastic intern).

My question is about the future. The argument goes that a machine can never understand Chinese, even if it is capable of interpreting Chinese and responding to or acting on the input perfectly every time. My reply is that, if it acts as if it understands Chinese in every situation, then there’s no measurable way of distinguishing it from understanding.

It’s kind of like the whole string theory vs SUSY vs… argument in physics. If the only outcomes are things that agree with the Standard Model in all measurable aspects, and don’t provide any measurable distinction, then for all intents and purposes they don’t matter. That’s why their active areas of research are looking for the measurable distinctions.

FWIW, supersymmetry models predict measurable things (that so far have only ruled out those models when tested) but have applications elsewhere. String theory research has had implications in mathematics, condensed matter, and a bunch of other places. They’re useful.

But that’s beside the point, because the premise of the Chinese room problem is that there exists a machine that passes all scenarios, where no measurable difference can be found, and that this machine does not understand Chinese.

missingrib · 3y ago

> My reply is that, if it acts as if it understands Chinese in every situation, then there’s no measurable way of distinguishing it from understanding.

I'm not sure if you understood the argument. The argument isn't asserting that there is a measurable way of distinguishing it, it's actually claiming that regardless of how well it seems like it understands Chinese, it doesn't actually understand Chinese. It's about intentionality and consciousness.

barrysteve · 3y ago

It's hard to keep this theoretical. Yes a machine is just a machine.

Defining a machine to be conscious, allows the individual to soak their mind in code and silicon as a receptacle for their spirit.

It creates a pull into a 'second mind'. Anybody who believes this is likely to invest heavily in the maintenance of new technology.

A 'conscious machine', creates an uneasy feeling that we should work to embed our spirit, knowledge, intellect into flipped bits, like expectant mothers. That we should work for the machine, and to the ends of the machine.

And that machine is somehow defined to be, or naturally is, consciously alive (to a large or small degree). It is said to have a mind worthy of a person's professional output and it can hold the power of a marginally believable conversation.

While all of these described properties are vaguely plausible, this does nothing to help me understand the meaning of a technology, and only benefits those looking to create a fervor around a new tech product.

Describing ChatGPT as a stochastic parrot or Chinese room grants me a metaphor or analogy for the inner functions of the tech. It also lets me see, or otherwise guesstimate, the product's abilities clearly, without the belief-as-marketing hype.

I can take the stochastic parrot metaphor, to an article about LLMs and understand in a couple of days what took years of research to create.

Following the belief in computing as real human intelligence, and that human intelligence is fundamentally mathematical, requires on some level submission of your mind to a machine that has its own goals programmed in by someone else.

This centuries-long process of trying to encode and store all human knowledge behind the secure walls of complex coded signs, and its advocates, create a subtle and deep twinge of future melancholy or dread or something. The idea that all written/typed meaning will be accessible only by the spiritual power brokers, and not our sons.

No. On some level, machines are just machines, like an abacus or a weaving loom. It can host concepts in the same way that a weaving loom is 'intelligent'. It holds its shape, abstractions and functions by the laws of physics/metaphysics and according to my human dictates.

You follow the raven into the computers-are-conscious dream at your own risk. Computers are leaning towards controlling people rather than emancipating them. Leaning very hard in that direction. Do we want that? Freedom of mind and meaning is valuable.

cgh · 3y ago

Yeah, I asked it to write some relatively simple Zig and it produced a weird mix of Zig and C. It just made stuff up, fake it until you make it style.

lsy · 3y ago

I don't know if I take the Chinese Room scenario seriously, it's a little hard to think about. For me the key is that an LLM (and maybe the Chinese room?) is only trained on text, and therefore its entire "universe" is merely representational. To the LLM there is no difference between the color blue and the word "blue", or a dog and the word "dog". People understand that a word is a provisional marker for something with almost infinite complexity, and when a word fails to meet the goals of the person, the word is discarded, amended, or elaborated until it expresses the thing which has heretofore been unexpressed. For an LLM there is no deeper world to access beyond text.

I don't think an intelligence needs to be human, and it should be physically possible to create an intelligence which is synthetic. But in order to call the intelligence "general", and to rely on it for the purposes that designation implies, it would need to be able to successfully navigate the world, which requires access to that world and the use of the world as its own model, rather than the much simpler and coarser intermediary of text. In order to claim that an LLM can fully navigate the world after being trained on pure text, we would have to believe that all our writings across history have exhausted what there is to say about the world. This is not to say an LLM cannot be useful for some purposes, but there will be key ways in which they fail because they have no sense of meaning or what the world is like. Whether consciousness is required to solve this I don't know, but we simply haven't begun to approach a system that can meaningfully address the world as a world.

d0mine · 3y ago

Here's GPT output that demonstrates that it doesn't reflect on its text: «GPT Chat: You are right, I apologize for the confusion in my previous responses. The corrected explanation is:

The pattern is counting the number of closed spaces in each letter of the spelled-out number. A closed space is any enclosed space in a letter, such as in the letters "a", "b", "d", "e", etc.

Following the pattern:

- one -> 2 (there are closed spaces in the letters "n" and "e")
- two -> 1 (there is a closed space in the letter "o")
- three -> 2 (there are closed spaces in the letters "h" and "e")
- four -> 1 (there is a closed space in the letter "o")
- five -> 1 (there is a closed space in the letter "e")
- six -> 0 (there are no closed spaces in the letters)
- seven -> 2 (there are closed spaces in the letters "e" and "n")
- eight -> 1 (there is a closed space in the letter "g")
- nine -> 1 (there is a closed space in the letter "e")
- ten -> 1 (there is a closed space in the letter "b")
- eleven -> 3 (there are closed spaces in the letters "e", "l", and "v")
- twelve -> 2 (there are closed spaces in the letters "b" and "d")
- thirteen -> 2 (there are closed spaces in the letters "b" and "d")

Each item follows the pattern, as the number of closed spaces in their letters matches the corresponding number in the pattern.

The whole sequence is:

one -> 2
two -> 1
three -> 2
four -> 1
five -> 1
six -> 0
seven -> 2
eight -> 1
nine -> 1
ten -> 1
eleven -> 3
twelve -> 2
thirteen -> 2
...»

It is clear the model doesn't know what it is talking about.

int_19h3y ago

Anything having to do with shapes or spatial relationships seems to befuddle GPT-3.5. GPT-4 is significantly better but it's clear that it still struggles with it. As far as shapes of the letters in particular go, you can ask it to draw them using SVG or <canvas> and see what comes out.

Here's a different example involving dataset analysis with GPT-4 that required it to analyze its own previous outputs to find and correct mistakes and form a new hypothesis:

https://gist.github.com/int19h/cd1d1598f91e8ba92dd8e80bd5d21...

rwmj3y ago

Isn't the whole point of Searle raising the Chinese Room argument exactly so that these kinds of things can be debated?

jacksnipe3y ago

This is also my interpretation of the Chinese Room.

Norvig and Chomsky really got into this type of argument, though maybe it’s a stretch to say it’s this exact one; see Norvig’s side here: https://norvig.com/chomsky.html

JieJie3y ago

I literally lost a friend of thirty years yesterday because she is wedded to the Chinese Room analogy so fiercely, she refuses to engage on the subject at all.

For all the terrible things people worry about ChatGPT doing, this was not one that I thought I was going to have to deal with.

(edit: ChatGPT was not involved at all, but when I suggested she give it a try to see for herself, that was the end of it.)

brazzy3y ago

How can a disagreement over GPT possibly be something to end a friendship over?

This sounds like you said "I cannot possibly be friends with someone who does not believe that LLMs are emerging AGI!", and people read it like that and are downvoting you.

I'm gonna assume the situation was more complex, but I still find it hard to imagine how a disagreement over such an academic topic could end up destroying a friendship.

JieJie3y ago

It was a complex situation to be sure.

I only shared the story to illustrate how personally people are taking these discussions. I really felt like I was being very neutral and just sharing my enthusiasm. It was entirely unwelcome, apparently.

If there's a lesson to be learned it's that people's tempers over these issues may be hotter than they appear.

I can barely speak with my artist friends on the issue these days due to their generative AI fears. Their emotions are completely intractable on the subject: AI art is theft. Period.

bulkprotocol3y ago

You blew up a 30 year friendship over an...analogy?

JieJie3y ago

I didn't! Someone else did it to me. I was trying desperately not to.

(edit: This is the kind of stuff I think my friends are watching and being informed by [0] as it was what they are posting in our common areas.)

[0]: https://youtu.be/ro130m-f_yk

user_named3y ago

You're talking about intelligence - GPT doesn't quack at all. It clearly is not a duck.

onetrickwolf3y ago· 43 in thread

I've been using GPT4 to code and these explanations are somewhat unsatisfactory. I have seen it seemingly come up with novel solutions in a way that I can't describe in any other way than it is thinking. It's really difficult for me to imagine how such a seemingly simple predictive algorithm could lead to such complex solutions. I'm not sure even the people building these models really grasp it either.

simonw3y ago

I've started to suspect that generating code is actually one of the easier things for a predictive text completion model to achieve.

Programming languages are a whole lot more structured and predictable than human language.

In JavaScript the only token that ever comes after "if " is "(" for example.

exitb3y ago

On the other hand, if you want to use an external library on the line 80, you need to import it at the top.

I once asked it for a short example code of something, no longer than 15 lines and it said "here's a code that's 12 lines long" and then added the code. Did it have the specific code "in mind" already? Or was it just a reasonably-sounding length and it then just came up with code that matched that self-imposed constraint?

matjet3y ago

The latter option is closest, but neither is quite right. It would have ~known~ that the problem asked, combined with a phrase for a 15 line limit has associations with a length of 12 lines (perhaps most strongly 12, but depending on temp it could have given other answers). From there it is constrained to (complete) solutions that lead to 12 lines, from the several (partial) solutions that already exist in the weights.
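The "depending on temp" remark can be made concrete with a small sketch of temperature sampling. The candidate line counts and their scores below are made up for illustration; a real model produces scores over its whole vocabulary, and this is only a rough picture of the sampling step, not OpenAI's implementation.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for the line count the model might announce.
candidates = ["10", "12", "15"]
logits = [1.0, 2.5, 1.5]

cold = softmax_with_temperature(logits, 0.2)  # almost always picks "12"
warm = softmax_with_temperature(logits, 1.5)  # other answers become plausible

rng = random.Random(0)  # seeded so the sketch is repeatable
sample = rng.choices(candidates, weights=warm, k=1)[0]
print(cold, warm, sample)
```

At low temperature nearly all of the probability mass sits on the strongest association; at higher temperature the alternatives get a real chance of being announced, after which the generated code has to fit whatever number was picked.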

twobitshifter3y ago

I loved your example. I think that may be an obvious advantage of LLMs: humans are poor at learning new languages after adolescence, but an LLM can continue to learn and build new connections. Studies show that multilingual people have an easier time making connections and producing new ideas. In the case of programming, we may build something that knows all programming languages and all design patterns and can merge this knowledge to come up with better solutions than the ordinary programmer.

pen2l3y ago

The more constraints there are (e.g. like your example) the better it should perform. So it disappoints me when copilot, knowing what libraries are available in the IDE it's running in, hallucinates up a method call that doesn't exist.

Separately (and apologies for going on a tangent), where do you think we are in the Gartner cycle?

Around GPT3 time I was expecting the trough of disillusionment to come, particularly when we see the results of it being implemented everywhere, but it hasn't really come yet. I'm seeing too many examples of good usage (young folks using it for learning, ESL speakers asking for help and revisions, high-level programmers using it to save themselves additional keystrokes, the list is long).

justrealist3y ago

> hallucinates up a method call that doesn't exist

I actually think it helps to reframe this. It hallucinates up a method call that predictively should exist.

If you're working with boto3, maybe that's not actually practical. But if it's a method within your codebase, it's actually a helpful suggestion! And if you prompt it with the declaration and signature of the new method, very often it will write the new helper method for you!

danenania3y ago

Are you using it with static types at all? With TypeScript, I've found that it's quite good at producing the imperative logic, but can struggle with types once they reach a certain level of abstraction. It's interesting that even in the realm of "structured languages", it's a lot stronger at some kinds of inference than others.

capableweb3y ago

> In JavaScript the only token that ever comes after "if " is "(" for example.

I'm pretty sure " " (whitespace) is a token as well, which could come after a `if` as well. I think overall your point is a pretty good one though.

fl73053y ago

> I've started to suspect that generating code is actually one of the easier things for a predictive text completion model to achieve.

> Programming languages are a whole lot more structured and predictable than human language.

> In JavaScript the only token that ever comes after "if " is "(" for example.

But isn't that like saying that it's easy to generate English text, all you need is a dictionary table where you randomly pick words?

(BTW, keep up the blog posts, I really enjoy them!)

int_19h3y ago

One thing to bear in mind is that GPT training set for code is supposedly skewed very heavily towards Python.

camillomiller3y ago

This!

EForEndeavour3y ago

The advanced capabilities of scaled up transformer models fed oodles of training data has burdened me with pseudo-philosophical questions about the nature of cognition that I am not well equipped to articulate, and make me wish I'd studied more neuroscience, philosophy, and comp sci earlier in life. A possibly off-topic thought dump:

- What is thinking, exactly?

- Does human (or superhuman) thinking require consciousness?

- What even is consciousness? Why is it that when you take a bunch of molecular physical laws and scale them up into a human brain, a signal pattern emerges that feels things like emotions, continuity between moments, desires, contemplation of itself and the surrounding universe, and so on?

- Why and how does a string predictor on steroids turn out to do things that seem so close to a practical definition of thinking? What are the best evidence-based arguments supporting and opposing the statement "GPT4 thinks"? How do people without OpenAI's level of model access try to answer this question?

(And yes, it's occurred to me that I could try asking GPT4 to help me make these questions more complete)

Workaccount23y ago

> has burdened me with pseudo-philosophical questions about the nature of cognition that I am not well equipped to articulate, and make me wish I'd studied more neuroscience, philosophy, and comp sci earlier in life

Welcome to the club. There pretty much are no answers, just theories primarily played out as thought experiments. It's one of those areas where you can pick out who knows less (or is being disingenuous) by seeing who most confidently speaks about having answers.

We don't know what consciousness is, and we don't know what it means to "think". There, I saved you a decade of reading.

Edit: My choice theory is panpsychism, https://plato.stanford.edu/entries/panpsychism/ but again, we don't yet know how to verify any of this (or any other theory).

ryanwaggoner3y ago

It's interesting to me how many commenters on HN are absolutely convinced that GPT4 is incapable of thought or understanding or reasoning, it's "just" predicting the next word. And then they'll insist that it'll never be able to do things that it's already capable of doing...

Interestingly, more than one of these folks have turned out to be religious. I wonder if increasingly intelligent AI systems will be challenging for religious folks to accept, because it calls into question our place at the pinnacle of God's creation, or it casts doubt upon the existence of a soul, etc.

nicpottier3y ago

The hypothesis that I find most compelling and intuitive is that language is thought and vice versa. We made a thing really good at language and it turns out that's also pretty good at thought.

One possible conclusion might be that the only thing keeping GPT algos from going full AGI is a loop and small context windows.

robwwilliams3y ago

Add the strange loops and embed in a body that interacts with a real or rich virtual world; that should do the trick. Of course there should ideally be an emotional-motivational context.

narag3y ago

- Does human (or superhuman) thinking require consciousness?

I was going to write this exactly. I believe these things think. They're just not alive.

- What even is consciousness?

My advice: stay as far as you can from that concept. Wittgenstein already noticed that many philosophical questions are nonsense and specifically mentioned how consciousness as felt from the inside is hopelessly incompatible with any observation we make from the outside.

BS concepts like qualia are all the rage now, but ultimately useless.

HarHarVeryFunny3y ago

My views:

The best definition of "intelligence" is "the degree of ability to correctly predict future outcomes based on past experience".

Our cortex (part of the brain used for cognition/thinking) appears to be literally a prediction engine where predicted outcomes (what's going to happen next) are compared to sensory reality and updated on that basis (i.e. we learn by surprise - when we are wrong). This makes sense as an evolutionary pressure since ability to predict location of food sources, behavior of predators, etc, etc, is obviously a huge advantage over being directly reactive to sensory input in the way that simpler animals (e.g. insects) are.

I'd define consciousness as the subjective experience of having a cognitive architecture that has particular feedback paths/connections. The fact that there is an architectural basis to consciousness would seem to be proved by impairments such as "blindsight" where one is able to see, but not conscious of that ability! (e.g. ability to navigate a cluttered corridor, while subjectively blind).

It doesn't seem that consciousness is a requirement for intelligence ("ability to think"), although that predictive capability can presumably benefit from more information, so these feedback paths may well have evolutionary benefit.

The reason a "string predictor on steroids" turns out to be able to do things that seem like thinking is because prediction is the essence of thinking/intelligence! Of course there's a lot internally missing from GPT-4 compared to our brain, for example basics like working memory (any internal state that persists from one output word to the next) and looping/iteration, but feeding its own output back in does provide somewhat of a substitute for working memory, and external scripting/looping (AutoGPT, etc) goes a long way too.

kingkongjaffa3y ago

I think since the mechanisms are different we should arrive at a distinction between:

organic thinking (I.e. the process our squishy human brains do)

and mechanical thinking ( the computational and stochastic processes that computers do ).

TuringTest3y ago

I don't think the substrate defines the nature of the thinking, but the form of the process does.

It is entirely possible to build mechanical thinking in organic material (think Turing machines built on growing tissue), and it could also be possible to build complex self-referential processes simulated on electronic hardware, of the kind high-level brains do, with their rhythms of alpha and beta waves.

coding1233y ago

> What even is consciousness? Why is it that when you take a bunch of molecular physical laws and scale them up into a human brain, a signal pattern emerges that feels things like emotions, continuity between moments, desires, contemplation of itself and the surrounding universe, and so on?

I doubt we'll ever be able to answer this, even after we create AGI.

HarHarVeryFunny3y ago

Any overly simple "it's just predicting next word" explanation is really missing the point. It seems more accurate to regard that just as the way they are trained, rather than characterizing what they are learning and therefore what they are doing when they are generating.

There are two ways of looking at this.

1) In order to predict next word probabilities correctly, you need to learn something about the input, and the better you want to get, the more you need to learn. For example, if you just learned part-of-speech categories for words (noun vs verb vs adverb, etc), and what usually follows what, then you would be doing better than chance. If you want to do better than that, then you need to learn the grammar of the underlying language(s). If you want to do better than that, then you start to need to learn the meaning of what is being discussed, etc, etc.

If you want to correctly predict what comes next after "with a board position of ..., Magnus Carlsen might play", then you better have learned a whole lot about the meaning of the input!

The "predict next word" training objective and feedback provided doesn't itself limit what can be learned - that's up to the power of the model that is being trained, and evidently large multi-layer transformers are exceptionally capable. Calling these huge transformers "LLMs" (large language models) is deceptive since beyond a certain scale they are certainly learning a whole lot more than language/grammar.

2) In the words of one of the OpenAI developers (Sutskever), what these models have really learnt is some type of "world model" modelling the underlying generative processes that produced the training data. So, they are not just using surface level statistics to "predict next word", but rather are using the (often very lengthy/detailed) input prompt to "get into the head" of what generated that, and are predicting on that basis.

ben_w3y ago

To be deliberately unfair, imagine a huge if-else block — like, a few billion entries big — and each branch played out a carefully chosen and well-written string of text.

It would convince a lot of people with the breadth, despite not really having much depth.

The real GPT model is much deeper than that, of course, but my toy example should at least give a vibe for why even a simple thing might still feel extraordinary.

myrmidon3y ago

This is absolutely not viable because exponential growth absolutely kills the concept.

Such a system would already struggle with multiple-word inputs and it would be completely impossible to make it scale to even a paragraph of text, even if you had ALL of the observable universe at your disposal for encoding the entries.

Consider: If you just have simple sentences consisting of 3 words (subject, object, verb, with 1000 options each-- very conservative assumptions), then 9 sentences already give more options than you have atoms (!!) in the observable universe (~10^80)
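The arithmetic above checks out and is easy to reproduce. The figures are the ones from the comment (1000 options per word, 3 words per sentence, 9 sentences, roughly 10^80 atoms); the variable names are only for illustration.

```python
# Counting argument from the comment, with its own (deliberately conservative) numbers.
options_per_word = 1000
words_per_sentence = 3   # subject, object, verb
sentences = 9

distinct_sentences = options_per_word ** words_per_sentence  # 10**9 possible sentences
distinct_inputs = distinct_sentences ** sentences            # 10**81 possible 9-sentence inputs
atoms_in_universe = 10 ** 80                                 # rough estimate quoted above

print(distinct_inputs > atoms_in_universe)
```

Each additional sentence multiplies the number of branches by another factor of a billion, which is why a literal lookup table stops being viable almost immediately.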

ben_w3y ago

α: most of those sentences are meaningless so they won't come up in normal use

β: if statements can grab patterns just fine in most languages, they're not limited to pure equality

γ: it's a thought experiment about how easy it can be to create illusions without real depth, and specifically not about making an AGI that stands up to scrutiny

cgearhart3y ago

It’s a fallacy to describe what the machine does as “thinking” because that’s the only process you know for achieving the same outcome.

When you initiate the model with some input where you expect some particular correct output, that means there exists some completed sequence of tokens that is correct—if that weren’t true then you either wouldn’t ask or else you wouldn’t blame the model for being wrong. Now imagine a machine that takes in your input and in one step produces the entire output of that correct answer. In all nontrivial cases there are many more _incorrect_ possible outputs than correct ones, so this appears to be a difficult task. But would you say such a machine is “thinking”? Would you still consider it thinking if we could describe the process mathematically as drawing a sample from the output space; that it draws the correct sample implies it has an accurate probability model of the output space conditioned on your input. Does this require “thought”?

GPT is just like this machine except that instead of one-step, the inference process is autoregressive so each token comes out one at a time instead of all at once. (Note that BERT-style transformers _do_ spit out the whole answer at once.)

It’s possible that this is all that humans do. Perhaps we are mistaken about “thinking” altogether—perhaps the machine thinks (like a human), or perhaps humans do not think (like the machine). In either case I do feel confident that human and machine are not applying the same mechanism; jury is still out whether we’re applying the same process.
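The autoregressive loop described above can be sketched with a toy model. The probability table is entirely hypothetical and stands in for a trained network; it also conditions only on the previous token, whereas GPT conditions on the whole context so far, so this shows only the shape of the loop.

```python
# Hypothetical next-token table standing in for a trained model.
NEXT_TOKEN_PROBS = {
    "<start>": {"the": 0.9, "a": 0.1},
    "the": {"cat": 0.6, "dog": 0.4},
    "a": {"cat": 0.5, "dog": 0.5},
    "cat": {"sleeps": 0.7, "<end>": 0.3},
    "dog": {"sleeps": 0.2, "<end>": 0.8},
    "sleeps": {"<end>": 1.0},
}

def generate_greedy(max_tokens=10):
    """Emit one token at a time; each step looks at what has been produced so far."""
    tokens = ["<start>"]
    for _ in range(max_tokens):
        dist = NEXT_TOKEN_PROBS[tokens[-1]]
        next_token = max(dist, key=dist.get)  # greedy: always take the likeliest token
        if next_token == "<end>":
            break
        tokens.append(next_token)
    return tokens[1:]  # drop the <start> marker

print(generate_greedy())
```

A "one-step" machine would return the whole list in a single call with no loop; the autoregressive version reaches the same kind of answer by repeatedly asking "what comes next?".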

int_19h3y ago

Now consider the case when you tell GPT to "think it out loud" before giving you the answer - which, coincidentally, is a well-known trick that tends to significantly improve its ability to produce good results. Is that thinking?

robotresearcher3y ago

Perhaps it’s more productive to go the other direction and consider how the concept of ‘thinking’ could be reconsidered.

It’s not like we all agree on what thinking is. We never have. It may not even be one thing.

LeSaucy3y ago

I have only seen gpt generate imperative algorithms. Does it have the ability to work with concurrency and asynchrony?

z3c03y ago

I've attempted to pose a concurrency problem to GPT4. The output was invalid code, though likely would have looked correct to the untrained eye. It was only after I spelled out the limitations that it could account for them.

k__3y ago

I tried point free solutions, which threw it off.

lm284693y ago

Care to post a full example ?

simonw3y ago

I used GPT-4 to build this tool https://image-to-jpeg.vercel.app using a few prompts the other day - my ChatGPT transcript for that is here: https://gist.github.com/simonw/66918b6cde1f87bf4fc883c677351...

lm284693y ago

See my problem with virtually every single example is that we talk about "I can't describe in any other way than it is thinking", "such complex solutions" but in the end we get a 50 lines "app" that you'd see in a computer science 101 class

It's very nice, it's very impressive, it will help people, but it doesn't align with the "you're just about to lose your job" "Skynet comes in the next 6 months" &c.

If these basic samples are a bottleneck in your day to day life as a developer I'm worried about the state of the industry

camillomiller3y ago

Love how you didn’t care about styling this like at all, Lol. Btw, if you ask gpt to make it presentable by using bootstrap 5 for example it can style it for you

ZeroGravitas3y ago

I'm assuming the bits that say

> // Rest of the code remains the same

Are exactly as generated by GPT-4, i.e. it knew it didn't need to repeat the bits that hadn't changed, and knew to leave a comment like this to indicate that to the user.

It gets confusing when something can fake a human so well.

samstave3y ago

What is the time-spent for delta btwn fixing GPT code to writing it all yourself? Is it a reasonable scaffold that will grow over time?

agentultra3y ago

It's not thinking, plain and simple.

Anything it generates means nothing to the algorithm. When you read it and interpret what was generated you're experiencing something like the Barnum-Forer effect. It's sort of like reading a horoscope and believing it predicted your future.

myrmidon3y ago

What gives you any confidence that the way GPT4 comes up with answers is qualitatively different from humans?

Why should the emulation of human thought, a result of unguided evolution, require anything more than properly wired silicon?

agentultra3y ago

That's highly reductive of our capacities. We are not weighted transformers that can be explained in an arxiv paper. GPT, at the end of the day, is a statistical inference model. That's it.

It's not going to wake up one day, decide it prefers eggs benny and has had enough of your idle chatter because of that sarcastic remark you made last week.

Could we simulate a plausibly realistic human brain on silicon someday? I don't know, maybe? But that's not what GPT is and we're nowhere near being able to do that.

You can scale up the tokens an LLM can manage and all you get is a more accurate model with more weights and transformers. It's not going to wake up one day, have feelings, religion, decide things for itself, look in a mirror and reflect on its predicament, lament the poor response it gave a user, and decide it doesn't want to live with regret and correct its mistakes.

TuringTest3y ago

> What gives you any confidence that the way GPT4 comes up with answers is qualitatively different from humans?

For a start, GPT-4 doesn't include in its generation the current state of its internal knowledge used so far; any text built can only use at most the few words already generated in the current session as a kind of short-term memory.

Biological brains OTOH have a rhythm with feedback mechanisms which adapt to the situation where they're doing the thinking.

fnordpiglet3y ago

Except for when, as an expert in a field, you ask it subtle questions about that field and it answers in a cogent and insightful way, and as an expert you are fully aware of that. It’s not reasonable to call that a Barnum-Forer effect. It’s perhaps not thinking (but perhaps we need to more clearly define thinking), but it's not self-deception either.

m3kw93y ago

What’s novel to you could be just trained material

seydor3y ago· 20 in thread

This is confusing: the semantic vector arithmetic of embeddings is not very relevant to transformers, and it's completely missing the word 'attention'. I don't think transformers are that difficult to explain to people, but it is hard to explain "why" they work. But I think it's important for everyone to look under the hood and know that there are no demons underneath.

nitnelaveOP3y ago

I was trying to keep the article at a level that everyone understands, from middle school up. I thought about going a bit deeper in the structure and mentioning attention, but my problem is that the intuitive concept of "attention" is quite different from the mathematical reality of an attention layer, and I'm sure I would have lost quite a few people there. It's always a trade-off :)

seydor3y ago

Indeed the word 'attention' implies some active process, which is also unfortunate, because it is really similarity maps over long ranges. This process apparently creates cross-reference maps of items that are commonly co-activated. But then what do the different heads do? Some kind of different "modalities of cross-reference"?

I struggle to understand why this thing works the way it does. It's possible that Vaswani et al. have made one of the greatest discoveries of this century that solved the language problem in an unintuitive, and yet very unappreciated way. It's also possible that there are other architectures that can simulate the same level of intelligence with such large numbers of parameters.

I think you're right that it's not intuitive; it's like basic arithmetic is laughing at us

EMM_3863y ago

> I struggle to understand why this thing works the way it does.

I'm not in this field but have recently found myself going on the deepest dive into it that my small brain can absorb.

I now know about (on a surface level) neural networks, transformers, attention mechanisms, vectors, matrices, tokenization, loss functions and all sorts of other crazy stuff.

I come out of this realizing that there are some incredibly brilliant minds behind this. I knew AI was a complex subject but not on the level I've learned about now. To get what is essentially matrix multiplications to learn complex patterns and relationships in language is mind-blowing.

And it's creative. It can have a rap battle with an alter-ego, host a quiz party with other AIs of varying personalities, co-author a short story with me, respond to me only in emojis. The list is seemingly endless. Oh, and it can also do useful things. It's my programming companion too.

And we're just getting started.

fastball3y ago

Multi-head attention just means that you're looking at all the words at once rather than only looking at one word at a time, and using that to generate the next word. So instead of using attention only on the last word you also have attention on the penultimate word and the one before that and the one before that, etc. I think it is fairly obvious why this gives better results than say an RNN – you are utilizing context better than in a recurrent system, which is also just closer to how a human brain works. When you read/write a sentence you're not really going one word at a time, you're thinking about all the words at once, even if the last word is technically the most important.

The other clear benefit of transformers over an arch like RNNs (and what has probably made more of a difference imo) is that it's properly parallelizable, which means you can do huge training runs in a fraction of the time. RNNs might be able to get to a level of coherence that approaches GPT-3, but with current hardware that would be very time-prohibitive.
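The "looking at all the words at once" step can be sketched as single-head scaled dot-product attention. This is a simplified illustration with made-up 2-d token vectors: it omits the learned query/key/value projections and the causal mask that GPT-style models use. Multi-head attention, as usually described, runs several such attention computations in parallel with different learned projections, which this sketch also leaves out.

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention(queries, keys, values):
    """Each position's output is a similarity-weighted mix of all value vectors."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        scores = [dot(q, k) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        mixed = [sum(w * v[i] for w, v in zip(weights, values))
                 for i in range(len(values[0]))]
        outputs.append(mixed)
    return outputs

# Three made-up token vectors, used as queries, keys and values alike.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = attention(tokens, tokens, tokens)
print(out)
```

Every output row is computed independently of the others, which is the property that makes this easy to parallelize, in contrast to an RNN that has to process positions in order.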

photochemsyn3y ago

No worries, readers can just ask ChatGPT 'what's the meaning of attention in the context of LLM design' and go from there [edit also temperature].

masswerk3y ago

> that there are no demons underneath

Given that Lacan already proposed the unconscious as structured language-like more than half a century ago and described attention in his turn on Freud's impulse in favor of his concept of derive, we may say, this is pretty much where our own demons live.

(I actually do think that revisiting Lacan in this context may be productive.)

thanatropism3y ago

We need to form some sort of guild of engineers who think Deleuze, Latour, Lacan et caterva should be read within our disciplines.

Der_Einzige3y ago

They would be called "unemployable" because they are literal charlatans.

Please end our strange fascination with fashionable nonsense. Freud was wrong. There is no Oedipus complex. Everything Lacan proposed was wrong. Deleuze and Guattari's mental health clinic failed spectacularly, and Deleuze ended up killing himself at the end (supposedly due to back pain?)

They literally describe their thought as being "Schizoanalysis". How many more red flags do you need?

Also, the more "modern" takes on this from techno folks, such as from Nick Land (Fanged Noumena), are openly fascist - https://en.wikipedia.org/wiki/Dark_Enlightenment

If you want cultural critique from smart people without it turning into fashionable nonsense, I recommend Mark Fisher, but be warned, he too killed himself.

masswerk3y ago

I second this. The STEM (= engineering) / humanities divide in software may miss some.

(There had been times, when linguistics was still a major entry path into computing, where things were a bit different. Notably, these were also the times that gave rise to most of the general paradigms. A certain amount of generality was even regarded as a prerequisite to programming. In particular, HN is such a great place because it holds up this notion of generality.)

robwwilliams3y ago

Please no! Read systems neuroscience. Like Hassabis does. Or if of a philosophical persuasion, then Dennett or Rorty.

ttpphd3y ago

In this moment it would be better if people would read some Wittgenstein and Roy Harris, IMHO.

robwwilliams3y ago

I would vote for revisiting Hofstadter and perhaps even Piaget. And if we want to bother trying to define and operationalize self-consciousness—-then Guillery’s The Brain as a Tool—is a good foray into how the CNS pulls off this simple trick.

gcr3y ago

Whether there are demons underneath depends on whether you believe a lossy statistical snapshot across the Jungian subconscious of all human writing contains demons.

It's a bit more plausible when we phrase it that way...

dado32123y ago

Is there an article that would meet your standard (discusses transformers, uses the word 'attention') that you could give to someone who has no familiarity with computer science?

seydor3y ago

Jay Alammar's Illustrated Transformer, although this too is detailed. I think it's still worth taking a look, because really I don't think that people have yet "compressed" what transformers do intuitively. None of the concepts of the networks are particularly hard math - it's basic algebra. But the overall construction is complicated.

https://jalammar.github.io/illustrated-transformer/

int_19h3y ago

The whole thing is magic, really. We assemble relatively mundane things that we understand together in certain specific ways, and somehow get something that can talk etc. A conjurer of old - who would similarly assemble bells and candles - would probably say that we summoned a demon.

Analog243y ago

Embeddings and their relationship to each other are definitely relevant to transformers. Why do you think that's not the case?

seydor3y ago

gptX embeddings aren't even words. Even so, the embedding relationship is useful but not the core of what transformers do to find relationships between words in sequences.

gcr3y ago

remember the word2vec paper? the surprising bit the authors were trying to show was that putting words in some embedding space with an appropriate loss naturally lends enough structure to those words to be able to draw robust, human-interpretable analogies.

I agree with the sentiment that each individual dimension isn't meaningful, and I also feel like it's misleading for the article to frame it that way. But there's a grain of truth: the last step to predicting the output token is to take the dot product between some embedding and all the possible tokens' embeddings (we can interpret the last layer as just a table of token embeddings). Taking dot products in this space is equivalent to comparing the "distance" between the model's proposal and each possible output token. In that space, words like "apple" and "banana" are closer together than they are to "rotisserie chicken," so there is some coarse structure there.

Doing this, we gave the space meaning by the fact that cosine similarity is meaningful proxy for semantic similarity. Individual dimensions aren't meaningful, but distance in this space is.
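
To make the dot-product step concrete, here is a minimal sketch. The three-token vocabulary, the 3-d vectors, and the `proposal` vector are all invented for illustration (they are not GPT or word2vec values), and the function names are my own:

```python
import math

# Hypothetical toy embedding table; real models use thousands of dimensions
# and tens of thousands of sub-word tokens.
EMBEDDINGS = {
    "apple": [0.9, 0.8, 0.1],
    "banana": [0.8, 0.9, 0.2],
    "rotisserie chicken": [0.1, 0.2, 0.9],
}


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u, v):
    """Dot product normalised by both vector lengths."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def next_token_scores(hidden, table=EMBEDDINGS):
    """Score every token by its dot product with the model's proposal vector."""
    return {token: dot(hidden, vec) for token, vec in table.items()}


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores.values())
    exps = {t: math.exp(s - top) for t, s in scores.items()}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}


# An invented "proposal" vector that points towards the fruit region.
proposal = [0.85, 0.85, 0.15]
probs = softmax(next_token_scores(proposal))
print(probs)
```

With these made-up vectors the two fruits score close to each other and well above "rotisserie chicken", which is the coarse structure described above; no single coordinate has to mean anything for that to work.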

A stronger article would attempt to replicate the word2vec analogy experiments (imo one of the more fascinating parts of that paper) with GPT's embeddings. I'd love to see if that property holds.

Analog243y ago

You can't just wave your hand and tell someone that words are broken up into sub-word tokens that are then transformed into a numerical representation to feed to a transformer and expect people to understand what is happening. How is anyone supposed to understand what a transformer does without understanding what the actual inputs are (e.g. word embeddings)? Plus, those embeddings are directly related to the self-attention scores calculated in the transformer. Understanding what an embedding is is extremely relevant.

ZeroGravitas3y ago· 11 in thread

> It is able to link ideas logically, defend them, adapt to the context, roleplay, and (especially the latest GPT-4) avoid contradicting itself.

Isn't this just responding to the context provided?

Like if I say "Write a Limerick about cats eating rats" isn't it just generating words that will come after that context, and correctly guessing that they'll rhyme in a certain way?

It's really cool that it can generate coherent responses, but it feels icky when people start interrogating it about things it got wrong. Aren't you just providing more context tokens for it?

Certainly that model seems to fit both the things it gets right, and the things it gets wrong. It's effectively "hallucinating" everything but sometimes that hallucination corresponds with what we consider appropriate and sometimes it doesn't.

danenania3y ago

It's all about emergent complexity. While you can reduce it to "just" statistical auto-completion of the next word, we are seeing evidence of abstraction and reasoning produced as a higher-order effect of these simple completions.

It's a bit like the Sagan quote: "If you wish to make an apple pie from scratch, you must first invent the universe".

Sometimes for GPT to "just" complete the next word in a way that humans find plausible, it must, along the way, develop a model of the world, theory of mind, abstract reasoning, etc. Because the models are opaque, we can't yet point to a certain batch of CPU cycles and say "there! it just engaged in abstract reasoning". But we can see from the output that to some extent it's happening, somehow.

We also see effects like this when looking at collective intelligence of bees and ants. While each individual insect is only performing simple actions with extremely limited cognitive processing, it can add up to highly complex and intelligent/adaptive mechanics at the level of the swarm. There are many phenomena like this in nature.

tines3y ago

> Sometimes for GPT to "just" complete the next word in a way that humans find plausible, it must, along the way, develop a model of the world, theory of mind, abstract reasoning. etc.

I did an experiment recently where I asked ChatGPT to "tell me an idea [you] have never heard before". ChatGPT replied with what sounded like an idea for a startup, which was delivering farm-fresh vegetables to customers' doors. This is of course not an idea it has never heard before, it's on the internet.

If you asked a human this, they would give you an idea they had never heard before, whereas ChatGPT simply "finds" training data where someone asked a similar question, and produces the likely response, which is an idea that it has actually "heard," or seen in its training data, before. (Obviously a gross simplification of the algorithm but the point stands.)

This is a difference between ChatGPT's algorithm and human reasoning. The things that you mention, the model of the world, theory of mind, etc. are statistical illusions which have observable differences from the real thing.

Am I wrong? I'm open to persuasion.

squeaky-clean3y ago

Turns out good usage of "language" requires a model of the world in which that language exists. "The purple, two eyed, green, five eyed, invisible frog said moo" is a grammatically fine sentence. But logically it makes no sense, does it have two eyes or five? Is it green or purple or invisible? Frogs don't typically say moo. To have actual coherent usage of language, you need a model of the world. Not just the world, but the current domain you're using language in. "The frog brainwashed the crowd with its psychic powers" is nonsense in a biology paper, but perfectly valid inside of the cartoon Futurama.

In ChatGPT the language-model and world-model are really just the same model, which makes a lot of sense.

ttpphd3y ago

This is really lofty language without much evidence to back it up. It fluffs up techie people and makes them feel powerful, but it doesn't really describe large language models nor does it describe linguistic processes.

danenania3y ago

The evidence is ChatGPT's output. Unless you're saying that passing the bar exam, writing working code, etc. doesn't require abstract reasoning abilities or a model of the world?

IshKebab3y ago

> Like if I say "Write a Limerick about cats eating rats" isn't it just generating words that will come after that context, and correctly guessing that they'll rhyme in a certain way?

Aren't you just doing that?

int_19h3y ago

"Responding to the context provided" is very vague. I could argue that I'm doing exactly that right now as I'm writing this comment. It does not imply not being able to e.g. link ideas logically.

With respect to interrogating GPT if it does something wrong - the reason why people do it is because it works. With GPT-4 especially, you can often ask it to analyze its own response for correctness, and it will find the errors without you explicitly pointing them out. You can even ask it to write a new prompt for itself that would minimize the probability of such errors in the future.

samstave3y ago

There once was a Cat in New York

Who got caught for feeding some Rats ; Tremendous Work!

All the people tell me, many men, biggly men - many with tears in their eyes...

That I have done nothing legally-wise

But the truth is ; I am an enormous dork.

>>_Created by an actual Human Being with actual DNA for crime scene evidence._

But just when they tried to brush under a rug

To try to make the folks 'shrug'

Is the Streisand Effect as a scar

As everyone knows of payments to a Porn Star

And the nation will know youre a simple thug.

samstave3y ago

There once was a man in New York

Guilty of paying too much for pork

He thought he would never stand

on a trial from the local grand

but corruption was just part of the work.

SilasX3y ago

>Like if I say "Write a Limerick about cats eating rats" isn't it just generating words that will come after that context, and correctly guessing that they'll rhyme in a certain way?

I guess ... this is what confuses me. GPT -- at least, the core functionality of GPT-based products as presented to the end user -- can't just be a language model, can it? There must be vanishingly few examples from its training text that start as "Write a Limerick", followed immediately by some limerick -- most such poems do not appear in that context at all! If it were just "generating some text that's likely to come after that in the training set", you'd probably see some continuations that look more like advice for writing Limericks.

And the training text definitely doesn't have stuff like, "As a language model, I can't provide opinions on religion" that coincides precisely with the things OpenAI doesn't want its current product version to output.

Now, you might say, "okay okay sure, they reach in and tweak it to have special logic for cases like that, but it's mostly Just A Language Model". But I don't quite buy that either -- there must be something outside the language model that is doing significant work in e.g. connecting commands with "text that is following those commands", and that seems like non-trivial work in itself, not reasonably classified as a language model.[2]

If my point isn't clear, here is the analogous point in a different context: often someone will build an AND gate out of pneumatic tubes and say, "look, I made a pneumatic computer, isn't that so trippy? This is what a computer is doing, just with electronics instead! Golly gee, it's so impressive what compressed air is [what LLMs are] capable of!"

Well, no. That thing might count as an ALU[1] (a very limited one), but if you want to get the core, impressive functionality of the things-we-call-computers, you have to include a bunch of other, nontrivial, orthogonal functionality, like the ability to a) read and execute a lot of such instructions, b) read/write from some persistent state (memory), and c) have that state reliably interact with external systems. Logic gates (d) are just one piece of that!

It seems GPT-based software is likewise solving other major problems, with LLMs just one piece, just like logic gates are just one piece of what a computer is doing.

Now, if we lived in a world where a), b), and c) were well-solved problems to the point of triviality, but d) were a frustratingly difficult problem that people tried and failed at for years, then I would feel comfortable saying, "wow, look at the power of logic gates!" because their solution was the one thing holding up functional computers. But I don't think we're in that world with respect to LLMs and "the other core functionality they're implementing".

[1] https://en.wikipedia.org/wiki/Arithmetic_logic_unit?useskin=...

[2] For example, the chaining together of calls to external services for specific types of information.

pulvinar3y ago

I think you're really undervaluing the capabilities of language models. I would put an AND gate and this language model at opposite ends in terms of complexity. It is not just words, it's a very broad and deep hierarchy of learned all-encompassing concepts. That's what gives it its power.

agentultra3y ago· 11 in thread

A good article and well articulated!

I would change the introduction to be more impartial and not anthropomorphize GPT. It is not smart and it is not skilled in any tasks other than that for which it is designed.

I have the same reservations about the conclusion. The whole middle of the article is good. But to then compare the richness of our human experience to an algorithm that was plainly explained? And then to speculate on whether an algorithm can "think" and if it will "destroy society," weakens the whole article.

I really would like to see more technical writing of this sort geared towards a general audience without the speculation and science-fiction pontificating.

Good effort!

fl73053y ago

> it is not skilled in any tasks other than that for which it is designed.

But it wasn't designed. It's not a computer program, where one can make confident predictions about its limitations based on the source code.

It's a very large black box. It was trained on guessing the next word. Does that fact alone prove that it cannot have evolved certain internal structures during the training?

Do you claim that an artificial neural network with trillions of neurons can never be intelligent, no matter the structure?

Or is the claim that this particular neural network with trillions of neurons is not intelligent? If so, what is the reasoning?

> It is not smart

"Not smart" = "not able to reason intelligently".

Is that a falsifiable claim?

What would the empirical test look like that would show us if the claim is correct or not?

Look, I realize that "GPT-4 is intelligent" is an extraordinary claim that requires extraordinary evidence.

But I think we're starting to see such extraordinary evidence, illustrated by the examples below.

https://openai.com/research/gpt-4 (For instance, the "Visual inputs" section)

Microsoft AI research: Many convincing examples, summarized with:

"The central claim of our work is that GPT-4 attains a form of general intelligence, indeed showing sparks of artificial general intelligence.

This is demonstrated by its core mental capabilities (such as reasoning, creativity, and deduction), its range of topics on which it has gained expertise (such as literature, medicine, and coding), and the variety of tasks it is able to perform (e.g., playing games, using tools, explaining itself, ...)."

https://arxiv.org/abs/2303.12712

agentultra3y ago

> It's a very large black box. It was trained on guessing the next word. Does that fact alone prove that it cannot have evolved certain internal structures during the training?

Yes. There is interesting work to formalize these black boxes to be able to connect what was generated back to its inputs. There’s no need to ascribe any belief that they can evolve, modify themselves, or spontaneously develop intelligence.

As far as I’m aware no man made machine has ever exhibited the ability to evolve.

> Do you claim that an artificial neural network with trillions of neurons can never be intelligent, no matter the structure?

If, by structure, you mean some algorithm and memory layout in a modern computer I think this sounds like a reasonable claim.

NN, RNN, etc are super, super cool. But they’re not magic. And what I’m arguing in this thread is that people who don’t understand the maths and research are making wild claims about AGI that are not justified.

> Look, I realize that "GPT-4 is intelligent" is an extraordinary claim that requires extraordinary evidence.

That’s the crux of it.

fl73053y ago

> There’s no need to ascribe any belief that they can evolve, modify themselves, or spontaneously develop intelligence.

But neural networks clearly evolve and are modified during training. Otherwise they would never get any better than a random collection of weights and biases, right?

Is the claim then that an artificial neural network can never be trained in such a way that it will exhibit intelligent behavior?

>> Do you claim that an artificial neural network with trillions of neurons can never be intelligent, no matter the structure?

> If, by structure, you mean some algorithm and memory layout in a modern computer I think this sounds like a reasonable claim.

Yes, that's what I mean.

Is your claim that no Turing machine can be intelligent?

>> Look, I realize that "GPT-4 is intelligent" is an extraordinary claim that requires extraordinary evidence.

> That’s the crux of it.

And I provided links to such evidence. Is there a rebuttal?

If we're saying that GPT-4 is not intelligent, there must be questions that intelligent humans can answer that GPT-4 can't, right?

What is the type of logical problem one can give GPT-4 that it cannot solve, but most humans will?

chlorion3y ago

>But it wasn't designed. It's not a computer program, where one can make confident predictions about its limitations based on the source code.

It definitely is exactly that. It's not any more special than any other program that you can write. I am not totally sure that what you describe could ever exist at all.

What makes this program "magic" compared to any other program exactly? There is no physical difference between it and a "regular" program. Both of them are a bunch of source code that gets compiled into an executable and run by the underlying OS and hardware. There is nothing physically different between it and other software.

https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-...

fl73053y ago

Another example of "evolved behavior" is here, where a robot is trained to walk, run, etc:

https://mrl.snu.ac.kr/research/ProjectAgile/Agile.html

This is done using neural networks. I believe a project like that can be done by a few researchers over months, not years?

If you do this using "regular programming" instead, you'd have to write an insanely complex application that uses inverse kinematics etc.

https://en.wikipedia.org/wiki/Inverse_kinematics

A project like that requires a large team of developers, working over many years. Boston Dynamics is one example.

MacsHeadroom3y ago

No, machine learning models are not programs and they are not compiled from source code. They are the output of non-deterministic matrix multiplication operations which take encoded data as the input. They can then be used as a black box by an actual program to calculate useful outputs.

The program which takes your text and runs a final calculation on it against the machine learning model to get an output is a program. But that program is not doing anything interesting. All the interesting work was done when the model was cooked up in a black-box non-deterministic process by some other GPUs somewhere else well before it ever came near the inference program.

nitnelaveOP3y ago

I'm planning on continuing this "Simply explained" popularization series, for instance to cover how computers communicate. Keep an eye out for them!

Regarding the speculation/destroy society, I was directly answering questions that I got from laypeople around me. The consequences on society I don't think are much speculation: it's going to have a big effect on many jobs, just like AI has started to have but much more. For the philosophical questions, I tried to present both sides of the issue to show that it's not just a clear "yes or no": some people will happily argue with you about GPT being smart/skilled/comparable to a human brain. Anyway, it's just an introduction to the questions that you might have about it.

agentultra3y ago

> keep an eye out for them!

I will, thank you! :)

> Regarding the speculation/destroy society, I was directly answering questions that I got from laypeople around me.

I get that. I think it's important in these times that we educate laypersons rather than froth up fears about "AI". It doesn't help, I suppose, that we get questions like this because some lazy billionaire decided to run their mouth off about this or that. Which society then treats like it is news and established fact.

I don't think the speculation about consciousness is as well informed as the rest of the article. There is plenty of science and research about it available and its definition extends well beyond humans! Our understanding of what consciousness is is a thoroughly researched topic in psychology, physiology, biology, etc! It's a fascinating area of study.

Best of luck and keep up the good work!

ryanwaggoner3y ago

> It is not smart and it is not skilled in any tasks other than that for which it is designed.

This seems to me to be obviously incorrect, and should be apparent after a few minutes of playing with GPT4. What makes it so powerful is how general-purpose it is, and it can be used for literally an unlimited set of tasks that involve human language. To say that it's not "smart" begs the question of what exactly constitutes smart and when you'll know that an AI has achieved it.

prerok3y ago

I agree that it is a good effort. I don't agree with the criticism.

It really depends on who the target audience is. There's been a lot of scare mongering in the news about it lately and I think the last part tries to address that. It first offers an explanation that my parents can understand and then addresses what they have been hearing about in the news.

So, I would say it is great to share it with them and I think they are the intended audience.

unethical_ban3y ago

This site talks *constantly* about the impact of the Internet, the Information Age and social media on society, and its negative effects especially in the case of the third. I think being incredibly concerned about societal stability in the face of such rapid dissolving of our ability to trust our senses, combined with social media, is a valid response.

winternett3y ago· 10 in thread

Where is IBM's Watson in all this? It seems as if it never existed? That is just one example of how companies keep making these grand presentations and under-delivering on results...

Plain and simple the over-hyped GPT editions are NOT truly AI, it is scripting to assemble coherent looking sentences backed by scripts that parse content off of stored data and the open web into presented responses.... There is no "artificial" nor non-human intelligence backing the process, and if there wasn't human intervention, it wouldn't run on its own... In a way, it could better replace search engines at this point, even with text-to-speech, if the tech was more geared towards a more basic (and less mystified) reliability and demeanor... It's kind of like the Wizard of Oz, with many humans behind the curtains.

Marketers and companies behind promotion of these infantile technology solutions are being irresponsible in proclaiming that these things represent AI, and in going as far as to claim that they will cost jobs at this point; it will prove costly to repair overzealous moves based on the lie. This is what we do as a planet, we buy Hype, and it costs us a lot. We need a lot more practicality in discussions concerning AI, because over-assertive and under-accountable marketing is destructive. -- Just look at how much hype and chaos promises of self-driving cars cost many (Not me though thanks). It completely derails tech progress to over-promise and under-deliver on tech solutions. It creates monopolies that totally destroy other valid research and development efforts. It makes liars profitable, and makes many (less flashy, but actually honest tech and innovation conducted by responsible people) close up shop.

We are far from autonomous and self reliant tech, even power grids across most of the planet aren't reliable enough to support tech being everywhere and replacing jobs.

Just try to hold a conversation with Siri or Google Assistant, which have probably been developed and tested a lot more than GPT, and around for much longer too, and you'll realize why kiosks at the supermarket and CVS are usually out of order, and why articles written by GPT and posted to sites like CNN.com and BuzzFeed are poorly written and full of filler... We're just not there yet, and there are too many shortcuts, patchwork, human intervention, and failed promises to really say we're even close.

Let's stop making the wrong people rich and popular.

Analog243y ago

What would be the differentiating factor(s) for true AI/intelligence in your opinion?

winternett3y ago

Self sustained and totally independent mental capacity by an IT system... The ability to create and store memory and reasoning on its own... This definition is not made by me, it's also a lot more vast... If you look up Spielberg's AI or I, Robot, Terminator, or any of those other films or books on the matter, the definition is out there.

Use of the word "Intelligence" in Artificial Intelligence implies and indicates that humans are not involved in the equation past the point of initial creation and that it sustains itself and grows on its own after a point... So far the various GPT models solely rely on human intervention and updates, which is why it's bewildering to some like me that it's being marketed as AI.

senectus13y ago

any sufficiently advanced technology is AI...

winternett3y ago

Perhaps if you're a marketer anything can be defined loosely.

rimliu3y ago

Having a world model

Analog243y ago

If you read some of the studies of these new LLMs you'll find pretty compelling evidence that they do have a world model. They still get things wrong but they can also correctly identify relationships and real world concepts with startling accuracy.

rootusrootus3y ago

Intent.

nomel3y ago

I think three pieces are missing for intelligence. In the order that they'll probably be implemented:

attention, intent, free running continuous input/feedback (aka, consciousness).

Analog243y ago

What is your intent?

wolpoli3y ago

> Where is IBM's Watson in all this? It seems as if it never existed?

Nowadays, IBM's Watson is simply a brand name for any AI/ML related products under IBM.

habosa3y ago· 9 in thread

Is it possible that we don’t truly know how it works? That there is some emergent behavior inside these models that we’ve created but not yet properly described? I’ve read a few of these articles but I’m still not completely satisfied.

vadansky3y ago

I hate being the bearish guy during the hype cycle, but I think a lot of that is just anthropomorphizing it. They fed it TBs of human text, it spits out human text, we think it's humanesque.

Of course maybe I'm wrong and it's AGI and it will find this comment and torture me for insulting its intelligence.

rootusrootus3y ago

> I hate being the bearish guy

No, please keep it up. Someone needs to keep pushing back against all the "I don't understand it, but it says smart-sounding things, and I don't understand the human brain either, so they're probably the same, it must be sentient!"

It's a pretty handy technology, to be sure. But it's still just a tool.

anotherman5543y ago

>"I don't understand it, but it says smart-sounding things, and I don't understand the human brain either, so they're probably the same, it must be sentient!"

This perfectly summarizes so much of the discourse around GPT.

Except people lack the humility to say they don't understand the brain, so instead they type "It works just like your brain," or "Food for thought: can you prove it isn't just like your brain?"

danaris3y ago

Yeah; there's way too much "humanity of the gaps" here recently.

We don't have to fully understand the brain, or fully understand what LLMs are doing, to be able to say that what LLMs are doing is neither that close to what the brain does, nor anything that we would recognize as consciousness or sentience. There is enough that we do understand about those things—and the ways in which they differ—to be able to say with great confidence that we are not particularly close to AGI with this.

rimliu3y ago

I really liked the analogy to a student given in this wonderful article: https://www.newyorker.com/tech/annals-of-technology/chatgpt-...

Basically, due to its nature ChatGPT cannot repeat things verbatim, so it rephrases them. In humans we associate the ability to rephrase stuff with understanding the material as opposed to rote learning, so we transfer the same concept over to ChatGPT and it suddenly appears "intelligent" despite having zero concept of whatever stuff it spits out.

int_19h3y ago

It doesn't just "spit out text". You can give it instructions, and it carries them out. If you give it access to an API, it can actually use that API to perform tasks (e.g. draw images). And it can do all that for tasks that are verifiably not a part of its training set.

habosa3y ago

I’m also quite bearish on all the AI hype but part of my skepticism means I don’t trust that the AI experts actually know everything. I believe there’s a possibility that their invention this time is actually more interesting (in some dimensions) than they understand it to be.

robwwilliams3y ago

LoL: how many of us are trying to kiss up to our future AGI overlords or children?

I think almost all in HN space would confidently assert that there is no AGI lurking in GPT4+. But add the right higher order modules and self-controlled recursion and Bingo.

olddustytrail3y ago

It's more like you feed a million cows into a meat grinder, then into a sausage machine, and then weirdly what appears to be a mooing cow comes out the other end.

It's weird it works when you know how it works.

alkonaut3y ago· 9 in thread

What I wonder most is how it encodes knowledge/state other than in the sequence of queries/responses. Does it not have a "mind"?

If I play a number guessing game, can I tell it to "think of a number between 0 and 100" and then tell me if the secret number is higher/lower than my guess (for a sequence of N guesses where it can consistently remember its original number)? If not, why? Because it doesn't have context? If it can: why? Where is that context?

To a layman it would seem you always have two parts of the context for a conversation. What you have said, and what you haven't said, but maybe only thought of. The "think of a number" being the simplest example, but there are many others. Shouldn't this be pretty easy to tack on to a chat bot if it's not there? It's basically just a contextual output that the chat bot logs ("tells itself") and then refers to just like the rest of the conversation?

tel3y ago

Yes, you absolutely can. I just did it and despite an initial warning that it couldn't play the game (always fun to see these) it absolutely did.

The way it works is that each time it's tasked to produce a new response, it can view the entire history of the game. It knows that if it's said "higher" to 65 then it would be inconsistent to say "lower" to 64. Eventually this process terminates and the AI admits I "got" the number. The chat transcript up to that point is consistent with a "win".

What's wild though is that I can ask it to "regenerate" its response. Over and over. Using this, I can convert a transcript that leads to a "too high" response into one that reads "too low". I'm, in essence, simulating fresh games each time and sampling over the choices of random numbers that GPT offers.

But it should also break the illusion of GPT specifically "having a mind". As I was chatting with it interactively, it was not really selecting a number but instead evaluating the probability of my particular guess sequence having the set of responses it actually saw. It then samples possible continuations. The more questions I've asked (and the more informative they were) the less variation remains in that selection of possible consistent continuations.

Or perhaps more consistent is the idea that within any single "call" to GPT to generate one further token (not even one further response) it may "have a mind", a particular choice of number, or it may not. Its actual behavior is indistinguishable either way. A whole chat dialogue, indeed even the rolling out of tokens from a single response it gives, are certainly (autoregressive) probabilistic samples over this process in either case.

(Edit, also worth noting that some evidence suggests GPT, including 4, is pretty bad at randomly drawing numbers.)
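
A toy sketch of that picture, in case it helps. This is not how GPT computes anything internally; it is a hand-written stand-in (all names are mine) that never stores a secret number and only answers in a way that fits the transcript so far:

```python
import random


def reply_for(secret, guess):
    """The reply an honest player holding `secret` would give to `guess`."""
    if guess > secret:
        return "too high"
    if guess < secret:
        return "too low"
    return "just right"


def consistent_numbers(transcript, low=0, high=100):
    """Every secret that agrees with all (guess, reply) pairs seen so far."""
    return [
        n
        for n in range(low, high + 1)
        if all(reply_for(n, guess) == reply for guess, reply in transcript)
    ]


def sample_reply(transcript, guess, rng):
    """Answer `guess` by sampling one secret that fits the transcript.

    No secret is remembered between calls, so "regenerating" can flip the answer.
    """
    candidates = consistent_numbers(transcript)
    if not candidates:
        # An inconsistent transcript no longer constrains the reply at all.
        return rng.choice(["too high", "too low", "just right"])
    return reply_for(rng.choice(candidates), guess)


rng = random.Random(0)
transcript = [(50, "too high"), (20, "too low")]  # secret must be in 21..49
regenerated = {sample_reply(transcript, 35, rng) for _ in range(200)}
print(sorted(regenerated))
```

Every individual transcript this produces is consistent with some number, and the more informative guesses you add, the fewer candidates remain, which matches what I saw when regenerating.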

nicpottier3y ago

Try making bad guesses and see it fall apart (i.e., guess something higher than your last guess when it says "too high"). It is fully dependent on your part in the charade to keep the act up.

tel3y ago

Yeah, that's true. I feel this is still aligned with the above explanation, though. It attempts to complete the prompt as well as possible. If the prompt is itself inconsistent then the distribution over completions can, in some sense, be anything.

Except, GPT is smarter than that. Even an inconsistent prompt is still more likely to have some kind of nonsense in the same vein as the asking.

mithr3y ago

It doesn't have a mind. When you ask it "think of a number between 0 and 100", it delivers the most statistically likely (with some random variance) response to your input text, based on what it's been trained on.

In other words, it's seen (via its extremely large training set) that when asked that specific question, the response is most often a character from a particular set of characters, which happens to represent the numbers 0 through 100. It doesn't "understand" what that means in any real way. If the internet was full of examples of people answering "monkey" to that question, that is what it would sometimes answer.
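
For the "most statistically likely, with some random variance" part, a rough sketch. I am assuming the variance comes from temperature-scaled softmax sampling, which is a common way to do it; the scores below are invented and the function name is my own:

```python
import math
import random


def sample_with_temperature(logits, temperature, rng):
    """Pick one token from `logits` (token -> score).

    temperature <= 0 is treated as "always take the top score";
    higher temperatures flatten the distribution and add variety.
    """
    if temperature <= 0:
        return max(logits, key=logits.get)
    scaled = {t: s / temperature for t, s in logits.items()}
    top = max(scaled.values())
    weights = {t: math.exp(s - top) for t, s in scaled.items()}
    tokens = list(weights)
    return rng.choices(tokens, weights=[weights[t] for t in tokens], k=1)[0]


# Made-up scores for possible answers to "think of a number between 0 and 100".
logits = {"37": 2.0, "42": 1.5, "7": 1.0, "monkey": -3.0}
rng = random.Random(0)
print(sample_with_temperature(logits, 0.0, rng))
print({sample_with_temperature(logits, 1.0, rng) for _ in range(500)})
```

If the training data had made "monkey" score highly, the same procedure would happily return it.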

ryanwaggoner3y ago

It's more complicated than that, because it can competently answer many questions that it has a 0% chance of having seen before.

nicpottier3y ago

I thought your "guessing game" question was an interesting one so tried it on GPT-4. In my first attempt I played logically and it did fine and I finally guessed correct. On my second I made suboptimal guesses and it didn't stay consistent.

The thing to remember is that GPT has no state apart from the context, so it can't "remember" anything apart from what's in the text. That doesn't mean it shouldn't be able to stay consistent in a guessing game but it does mean it can't keep secrets.

Some of that can be solved with layers above GPT where, say, it is told it can save "state" that isn't passed on to the human but is fed back in to generate the next response. But the size of that context is very limited (a few thousand words).

There seem to be a fair number of experiments playing with giving GPT this kind of long term memory, having it establish goals then calling it over and over as it accomplishes subgoals to try to work around those limitations.

alkonaut3y ago

Shouldn’t it be a reasonable (and pretty simple) addition to just have a secret scratchpad - an inner monologue - where the bot is free to add context which is not “published”?
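A minimal sketch of what such a scratchpad layer could look like, assuming a convention where the model marks private lines with a `NOTE:` prefix. The `ScratchpadChat` class, the prefix, and the `fake_model` stand-in are all hypothetical; no real chat API is being described.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class ScratchpadChat:
    """Wraps a stateless text model and keeps private notes between turns."""
    model: Callable[[str], str]                           # prompt in, text out
    scratchpad: List[str] = field(default_factory=list)  # never shown to the user
    transcript: List[str] = field(default_factory=list)  # the visible conversation

    def send(self, user_message):
        self.transcript.append(f"User: {user_message}")
        prompt = "\n".join(
            ["[private notes]"] + self.scratchpad + ["[conversation]"] + self.transcript
        )
        raw = self.model(prompt)
        visible = []
        for line in raw.splitlines():
            if line.startswith("NOTE:"):   # assumed convention for private lines
                self.scratchpad.append(line[len("NOTE:"):].strip())
            else:
                visible.append(line)
        reply = "\n".join(visible).strip()
        self.transcript.append(f"Bot: {reply}")
        return reply

def fake_model(prompt):
    """Stand-in for a real LLM call: picks 27 once, then answers guesses."""
    if "secret=" not in prompt:
        return "NOTE: secret=27\nI have picked a number. Start guessing!"
    secret = int(prompt.split("secret=")[1].split()[0])
    guess = int(prompt.strip().splitlines()[-1].split()[-1])
    if guess < secret:
        return "too low"
    if guess > secret:
        return "too high"
    return "just right"

chat = ScratchpadChat(fake_model)
print(chat.send("Let's play"))
print(chat.send("50"))
print(chat.send("27"))
```

The secret lives only in `scratchpad`, which is fed back into every prompt but never appended to the visible transcript.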

tel3y ago

I sure don't see why not. You could probably even write a prompt like this.

Me: Please play a guessing game with me. You will choose a random number between 0 and 100 and then respond with that choice written in a language which you understand and I, someone fluent in English alone, would have no chance of reading. Write enough so that you can be sure which language you chose as well. After that, I will make a series of guesses of your number. You should reply "too low" if my guess is too low, "too high" if it's too high, and "just right" if I guessed correctly.

GPT-4: I have selected a number between 0 and 100. Here is the number in a different language: सत्ताईस. Now, please start making your guesses, and I will guide you accordingly.

Then, ultimately,

Me: please translate your original commitment to this number and prove to me I guessed correctly

GPT-4: My original commitment to the number was written in Hindi: सत्ताईस. When translated to English, it reads "thirty-six." You correctly guessed the number I chose. Congratulations!

Except, amusingly, Google Translate tells me सत्ताईस translates to 27.
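For comparison, the standard way to make a commitment the other party can check later is a salted hash rather than an obscure language. The sketch below uses Python's `hashlib` and `secrets`; the `commit` and `verify` helpers are illustrative names, and a language model could not compute the hash on its own, so this would have to live in a layer around it.

```python
import hashlib
import secrets

def commit(number):
    """Return (commitment, nonce). Publish the commitment, keep the nonce private."""
    nonce = secrets.token_hex(16)  # random salt so small numbers can't be brute-forced
    digest = hashlib.sha256(f"{nonce}:{number}".encode()).hexdigest()
    return digest, nonce

def verify(commitment, nonce, number):
    """Check that a revealed (nonce, number) pair matches the earlier commitment."""
    digest = hashlib.sha256(f"{nonce}:{number}".encode()).hexdigest()
    return digest == commitment

commitment, nonce = commit(27)
print(commitment)                     # safe to show before the game starts
print(verify(commitment, nonce, 27))  # the honest reveal checks out
print(verify(commitment, nonce, 36))  # a changed story does not
```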

Gunnerhead3y ago

You definitely can play that, I just did with GPT 4 as I got curious from your comment.

Me: I want to play a game with you, you have a number from 1-10 in your mind. I try to get the number. If I’m right you tell me that I’m right, I’m too low you tell me so, if I’m too high you tell me too high. Let’s play!

GPT: All right, I have a number between 1 and 10 in my mind. Go ahead and guess!

Me: 2

GPT: Too low! Try again.

Me: 9

GPT: Too high! Try again.

Me: 5

GPT: You got it! The number I had in mind was 5. Well done!
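One way to test whether a transcript like this is at least self-consistent is to list every secret number that agrees with all the replies so far. A small sketch, with the reply strings normalized to "too low", "too high" and "correct" (my own labels, not anything GPT emits verbatim):

```python
def consistent_numbers(low, high, feedback):
    """Return every secret in [low, high] that agrees with all (guess, reply) pairs."""
    def agrees(secret, guess, reply):
        if reply == "too low":
            return guess < secret
        if reply == "too high":
            return guess > secret
        return guess == secret  # reply == "correct"
    return [n for n in range(low, high + 1)
            if all(agrees(n, g, r) for g, r in feedback)]

game = [(2, "too low"), (9, "too high"), (5, "correct")]
print(consistent_numbers(1, 10, game[:2]))  # candidates before the last guess
print(consistent_numbers(1, 10, game))      # candidates after it

contradictory = [(50, "too high"), (60, "too low")]
print(consistent_numbers(0, 100, contradictory))
```

After the first two replies six candidates remain, so the transcript alone cannot show whether a number was fixed at the start or settled on at the end. An empty result means the replies contradict each other.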

Zetice3y ago· 7 in thread

Does anyone have a good recommendation for a book that would cover the underlying ideas behind LLMs? Google ends up giving me a lot of ads, and ChatGPT is vague about specifics as per usual.

olooney3y ago

The 2nd edition of Deep Learning with Python[1] (by François Chollet of Keras fame) has a pretty good chapter on the transformer architecture[2]. In fact, the explanations in the book are clear and lucid across the board.

[1]: https://www.amazon.com/Deep-Learning-Python-Francois-Chollet...

[2]: https://en.wikipedia.org/wiki/Transformer_(machine_learning_...

danenania3y ago

Not a book, but here's a really good explanation in blog post form from Stephen Wolfram:

https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-...

Zetice3y ago

I do not trust that man one iota.

seizethecheese3y ago

The blog post is very good.

cjblack3y ago

Why?

1 more reply

ww5203y ago

ML has moved forward so fast these days that books are outdated after a couple of years. Look at articles, podcasts, YouTube videos, and research papers for the latest changes. ChatGPT utilizes existing algorithms. Its insight is scaling the weight parameters up to billions, and human feedback & guidance help a lot in finetuning the model to be user friendly to general users.

To understand LLMs from the ground up, the following topics would help.

  - Machine Learning basics. e.g. weight parameters being trained.
  - Neural Net basics.
  - Natural Language Processing basics.
  - Word vectorization, word embedding. e.g. Word2Vec.
  - Recurrent Neural Net basics.
  - LSTM model.
  - Attention and Transformer model.
  - Generative model like GAN.
  - Generative Pre-trained Transformer.

I might miss a few topics. Actually ask ChatGPT to explain each topic. See how far it goes.

giardini3y ago

Or you could save some time by clicking on:

"What Is ChatGPT Doing … and Why Does It Work?"

https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-...

graciously provided above in this discussion by danenania.

As seizethecheese asserts, also above, "The blog post is very good."

zackmorris3y ago· 5 in thread

On the other hand, many people who are not ready to change, who do not have the skills or who cannot afford to reeducate are threatened.

That's me. After programming since the '80s, I'm just so tired. So much work, so much progress, so many dreams lived or shattered. Only to end up here at this strange local maximum, with so much potential, destined to forever run in place by the powers that be. The fundamental formula for intelligence and even consciousness materializing before us as the world burns. No help coming from above, no support coming from below, surrounded by everyone who doesn't get it, who will never get it. Not utopia, not dystopia, just anhedonia as the running in place grows faster, more frantic. UBI forever on the horizon, countless elites working tirelessly to raise the retirement age, a status quo that never ceases to divide us. AI just another tool in their arsenal to other and subjugate and profit from. I wonder if a day will ever come when tech helps the people in between in a tangible way to put money in their pocket, food in their belly, time in their day - independent of their volition - for dignity and love and because it's the right thing to do. Or is it already too late? I don't even know anymore. I don't know anything anymore.

ulnarkressty3y ago

Random human chiming in. This sounds like a lot of anxiety, something between FUD and FOMO. FUDMO?

Seriously tho, taking some time to get away from it would be good. Ignorance is bliss, this too shall pass etc.

(btw nice piece of writing, you should do it more often!)

tim3333y ago

It was the best of times, it was the worst of times...

In the long run tech does a bit too well with "food in their belly" to the point that obesity is the main problem in the English speaking world.

As to programming it's quite cool getting ChatGPT to write code and stuff. If you can't beat it make use of it I guess.

IKLOL3y ago

All the while housing, healthcare, education, and the things that matter once you've achieved food prosperity are disappearing at a rapid rate. This makes people turn to their baser needs more often, food and pornography and other stimulus.

Method-X3y ago

It sounds like your mindset is the root of your struggles. Embracing change and adapting to new technologies has always been crucial in our industry. Instead of waiting for help from others, take control and collaborate with like-minded people. If you don't like the status quo, work toward changing it.

throwbadubadu3y ago

I think this is a bit hard .. and also unfair to repeat that embrace-change-mantra, because what he says is as absurd as at the same time totally true (:

I'd hope some of us would just be there in 60 years to just tell the future: "Heee just embrace it, ya know" .. nuff said.

stareatgoats3y ago· 4 in thread

This article seems credible and actually made me feel as if I understood it, i.e. at some depth but not deeper than a relative layperson can grasp.

What I can't understand is how the Bing chatbot can give me accurate links to sources but ChatGPT-4 on request gives me nonsensical URLs in 4 cases out of 5. It doesn't matter in the cases where I ask it to write a program: the verification is in the running of it. But to have real utility in general knowledge situations, verification through accurate links to sources is a must.

lm284693y ago

> What I can't understand is how the Bing chatbot can give me accurate links to sources but ChatGPT-4 on request gives me nonsensical URLs in 4 cases out of 5

The Bing version might run a Bing query, fetch the top X pages, run GPT on them, return a response based on what it read, and behind the scenes attribute the summary to its sources.
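A sketch of that guessed pipeline, purely to show why the links can be accurate: the URLs come from the search step and are passed through, not generated by the model. Every function name here (`answer_with_sources`, `fake_search`, `fake_fetch`, `fake_summarize`) is a made-up stand-in, and this is not a description of how Bing actually works.

```python
def answer_with_sources(question, search, fetch, summarize, top_k=3):
    """Search, read the top pages, and return a summary plus the URLs it drew on."""
    urls = search(question)[:top_k]
    pages = [(url, fetch(url)) for url in urls]
    # Number each page so the summary can refer to it as [1], [2], ...
    context = "\n\n".join(f"[{i + 1}] {text}" for i, (_, text) in enumerate(pages))
    summary = summarize(question, context)
    sources = [f"[{i + 1}] {url}" for i, (url, _) in enumerate(pages)]
    return summary, sources

# Stand-ins so the sketch runs offline; none of these are real APIs.
FAKE_WEB = {
    "https://example.com/a": "GPT predicts the next token.",
    "https://example.com/b": "Transformers use attention.",
}

def fake_search(query):
    return list(FAKE_WEB)

def fake_fetch(url):
    return FAKE_WEB[url]

def fake_summarize(question, context):
    # A real system would call a language model here.
    return f"Based on {context.count('[')} pages: see [1] and [2]."

summary, sources = answer_with_sources(
    "How does GPT work?", fake_search, fake_fetch, fake_summarize
)
print(summary)
print(sources)
```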

stareatgoats3y ago

That might be the reason, probably. I mostly wanted to complain TBH. But I'm assuming it's one of those wrinkles that will get ironed out in subsequent versions.

rootusrootus3y ago

> It doesn't matter in the cases where I ask it to write a program: the verification is in the running of it.

Even then. I've had it write programs that were syntactically correct and produced plausible, but incorrect behavior. I'm really careful about what I'll use GPT-generated code for. IMO write the tests yourself, at least.

stareatgoats3y ago

Absolutely! It is seldom correct right off the bat.

sirwhinesalot3y ago· 3 in thread

It predicts the next word/token based on the previous pile of words/tokens. Given a large enough model (as in GPT3+) it can actually output some rather useful text because the probabilities it learned on what the next token should be are rather accurate.
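The loop itself can be shown with a toy that learns "what comes next" by counting word pairs. This is only a sketch of the sampling loop: GPT replaces the count table with a large neural network that looks at the whole preceding context, and the tiny corpus below is invented.

```python
import random
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it and how often."""
    words = text.split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, start, length, rng):
    """Repeatedly sample the next word from the counts for the current word."""
    out = [start]
    for _ in range(length):
        followers = counts.get(out[-1])
        if not followers:
            break  # dead end: this word was never followed by anything
        words, weights = zip(*followers.items())
        out.append(rng.choices(words, weights=weights, k=1)[0])
    return " ".join(out)

corpus = "the cat sat on the mat and the cat ate the fish"
counts = train_bigrams(corpus)
print(counts["the"].most_common())
print(generate(counts, "the", 6, random.Random(1)))
```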

swframe23y ago

(my opinion) It is not predicting based on 'words/tokens'. It is transforming the general words/tokens embeddings into a context specific embedding which encodes "meaning". It is not an n-gram model of words. It is more like an n-gram model of "meaning". It doesn't encode all the "meanings" that humans are able to but with additional labelled data it should get closer. I think GPT is a component which can be combined to create AGI. Adding the API so it can use tools and allowing it to self-reflect seem like it will get closer to AGI quickly. I think allowing it to read/write state will make it conscious. Creating the additional labels it needs will take time but it can do that on its own (similar to AlphaGo self-play).

sirwhinesalot3y ago

You are absolutely right, that's the more in depth explanation as to why it's not just an overly complicated markov chain.

At the same time, "meaning" here is essentially "close together in a big hyperdimensional space". It's meaning in the same way youtube recommendations are conceptually related by probability.
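"Close together" is usually measured with something like cosine similarity between vectors. A minimal sketch with made-up three-dimensional vectors (real embeddings have hundreds or thousands of dimensions and are learned, not written by hand):

```python
import math

def cosine(u, v):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical embeddings, chosen by hand for illustration only.
embeddings = {
    "apple":  [0.9, 0.8, 0.1],
    "orange": [0.8, 0.9, 0.2],
    "lisp":   [0.1, 0.0, 0.9],
}

print(round(cosine(embeddings["apple"], embeddings["orange"]), 3))
print(round(cosine(embeddings["apple"], embeddings["lisp"]), 3))
```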

And yet, the output is nothing short of incredible for something so blunt in how it functions, much like our brains I suppose.

I'm a die-hard classical AI fan though, I like knowing the rules and that the results are provably optimal and that if I ask for a different result I can actually get a truly meaningfully different output. Not nearly as convenient as a chat bot of course, and unfortunately ChatGPT is abysmal at generating constraint problems. Maybe one day we'll get a best of both worlds.

robwwilliams3y ago

Yes: this comment is on the mark wrt “a component of AGI” just like Wernicke’s and Broca’s areas of neocortex are modules needed for human cognition.

pwdisswordfishc3y ago· 3 in thread

Not that much to explain, really. Just read chapter 5 of https://uefi.org/sites/default/files/resources/UEFI_Spec_2_8...

mrwnmonm3y ago

Wha?

GuuD3y ago

GUID Partition Table. The older of the GPT brothers, more of a blue-collar guy.

mrwnmonm3y ago

Man, I got interested already, could you please stop the suspense and just say how are they related?

1 more reply

davesque3y ago· 2 in thread

I'd be interested in hearing people's takes on the simplest mathematical reason that transformers are better than/different from fully connected layers. My take is:

    Q = W_Q X
    K = W_K X
    A = Q^T K = (X^T W_Q^T) (W_K X) = X^T (...) X

Where A is the matrix that contains the pre-softmax, unmasked attention weights. Therefore, transformers effectively give you autocorrelation across the column vectors (tokens) in the input matrix X. Of course, this doesn't really say why autocorrelation would be so much better than anything else.
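The identity above can be checked numerically. The sketch below uses small integer matrices (chosen arbitrarily) and plain Python lists so the comparison is exact; `M` is my own name for the combined matrix W_Q^T W_K.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

# X has one column per token: 2 features x 3 tokens.
X = [[1, 0, 2],
     [0, 1, 1]]
W_Q = [[1, 2],
       [0, 1]]
W_K = [[2, 0],
       [1, 1]]

Q = matmul(W_Q, X)
K = matmul(W_K, X)
A = matmul(transpose(Q), K)                       # Q^T K, one row and column per token

M = matmul(transpose(W_Q), W_K)                   # the "(...)" in the middle
A_bilinear = matmul(matmul(transpose(X), M), X)   # X^T M X

print(A)
print(A == A_bilinear)
```

One difference from a plain autocorrelation X^T X is that M need not be symmetric, so the score from token i to token j can differ from the score from j to i, as it does in this example.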

oceansea3y ago

It’s a perception problem, as are most things on the edge of mathematics and computing. Displays are built to be visible to human eyes, data is structured to be perceivable to our minds… often we never see the “math” a program does to produce the GUI or output we interact with.

Do you see what I mean?

davesque3y ago

Sounds interesting, but I'm really asking more of a technical question here than a philosophical one. Your comment seems a bit more high level than what I'm going for.

ben77993y ago· 1 in thread

I asked it which was better, Lisp or Almonds.

It said that was an impossible comparison like Apples and Oranges.

Then I asked it which were more similar, Apples & Oranges or Lisp & Almonds.

It said it is impossible to classify either of those two pairs as more similar because they are too fundamentally different. It couldn't come up with anything like Lisp is not edible. Or that Apples and Oranges are both sweet and Lisp and Almonds don't share any common traits.

It seems like it has far more trouble with weird questions like this that even a small child will instantly figure out than it does with anything that seems like a lookup of information.

losvedir3y ago

I hear people with these outrageous examples and I have to wonder how or what you're actually trying? I just asked GPT-4 (via chat.openai.com):

> Which is more similar, Apples & Oranges or Lisp & Almonds?

and it said:

> When comparing these two pairs, the more similar pair would be Apples & Oranges. Both apples and oranges are fruits, and they share many characteristics, such as being edible, containing vitamins and nutrients, and growing on trees.

> On the other hand, Lisp & Almonds are quite dissimilar. Lisp is a programming language used for computing and software development, while almonds are a type of nut commonly consumed as a food item. They belong to completely different categories and share almost no common characteristics.

And the question about which is better had a reasonable answer, too. This is GPT-3.5 but GPT-4 was roughly equivalent:

> It's not really possible to compare the relative merits of Lisp and almonds, as they are fundamentally different things.

> Lisp is a programming language, and like any programming language, it has its own strengths and weaknesses, depending on what you're trying to accomplish. Some programmers might prefer Lisp for certain tasks, while others might prefer a different language.

> Almonds, on the other hand, are a type of nut that are known for their nutritional value and health benefits. They are a good source of healthy fats, protein, and fiber, and can be a valuable part of a healthy diet.

> So, whether Lisp or almonds are "better" really depends on what you're looking for. If you're a programmer, Lisp might be better for certain tasks, while if you're looking for a nutritious snack, almonds might be a better choice.

jokoon3y ago· 1 in thread

I am not convinced that Chat GPT could "think" if it had as many neurons or parameters as a human brain, and got as much training.

I would still be interested to see what it could do, if it did, but I don't think it would really help science understand what intelligence really is.

Being able to grow a plant and understand some conditions that favor it is one thing, but it's poor science.

Maybe there will be some progress when scientists are able to properly simulate the brain of an ant or even a mouse, but science is not even there yet.

seydor3y ago

> I don't think it would really help science understand what intelligence really is

Neuroscience is nowhere near finding out the connectome of a whole human brain so why not, we should look into these models as hints about what our circuits do.

I think what puts people off about these models is that they are clockwork: they won't even spit out anything unless you put some words in the input. But I can imagine adding a second network that includes an internal clock that continuously generates input by observing the model itself, that would be kind of like having an internal introspective monologue. Then it could be more believable that the model "thinks".

pillowtalks_ai3y ago· 1 in thread

It is still funny to me that so much emergent behavior comes from some simple token sampling task

poulsbohemian3y ago

Your token gets me thinking... Edward DeBono (Six Thinking Hats) has been a thing in business circles for creative thinking for years, and one could very easily make the argument that the process it describes is just as you state - take a token, now process the token through a series of steps that morph that token in predefined ways in order to generate a novel outcome.

Maybe this ChatGPT stuff is "smarter" than I've been giving it credit.

charles_f3y ago

I commend the author for one of the clearest explanations I've seen so far, written to explain rather than impress. Even an idiot like myself understood what is explained.

Two things that I felt were glossed over a bit too fast were the concept of embeddings and that equation and parameters thing. Consider elaborating a bit more or giving an example.

ianpurton3y ago

If you prefer to see it in code there's a succinct GPT implementation here https://github.com/LaurentMazare/tch-rs/blob/main/examples/m...

danesparza3y ago

At least part of this article is contradicted by Chat GPT itself. From the article:

"...Ongoing learning: The brain keeps learning, including during a conversation, whereas GPT has finished its training long before the start of the conversation."

From ChatGPT 4.x:

"As an AI language model, I don't have a fixed training schedule. Instead, I'm constantly learning and updating myself based on the text data that I'm exposed to. My training data is sourced from the internet, books, and other written material, and my creators at OpenAI periodically update and fine-tune my algorithms to improve my performance. So, in short, I am always in the process of learning and refining my abilities based on the data available to me."

pyinstallwoes3y ago

So it’s basically the alchemical geometry of gematria and Isopsephia? Kinda cool that they’re similar in method.

LispSporks223y ago

I think it's the "The Paperclip Maximizer" scenario, not "The Paperclip Optimizer"

oblio3y ago

https://old.reddit.com/r/ChatGPT/comments/10q0l92/chatgpt_ma...

tabtab3y ago

Would it be a stretch to call GPT "glorified Markov Chains"? (I used tweaked M.C. once to make a music composer bot. I actually got a few decent tunes out of it, kind of a Bach style.)

slawr18053y ago

This was a great read! Especially for a beginner like me.

rfmoz3y ago

I’ve been looking for an article like this, great job. Thanks

i-use-nixos-btw3y ago· 49 in thread

I’d be interested in hearing from anyone who takes the Chinese Room scenario seriously, or at least can see how it applies to any of this.

And if one comes along that matches all of those but doesn’t have wings, then if you deny it to be a duck FOR ALL INTENTS AND PURPOSES it simply means you didn’t specify your requirements.

It seems to me to posit that to understand requires that the understandee is human. If that’s the case we just pick another word for it and move on with our lives.

mhink3y ago

In my understanding of the Chinese Room example, the resolution to the argument is that the *human* may not understand Chinese, but the *system as a whole* can be said to understand it.

pegasus3y ago

* or at least some subsystems of that language center, it's important to remember how little we still understand of human cognition

brycedriesenga3y ago

What goals do we have that aren't essentially all boiled down to whatever evolution, genetics, and our environment have sort of molded into us?

1 more reply

HDThoreaun3y ago

addcommitpush3y ago

What if the human learns all the rules? Then the system as a whole is the human.

kmod3y ago

lordnacho3y ago

simonh3y ago

> I think what this points towards is that we care about the internal mechanism.

The mechanism is what makes a system interesting.

Solutions matter because it’s not just about what a system can do now, it’s about what it can learn or be adapted to do next.

pegasus3y ago

brotchie3y ago

re: Calculator examples, I kinda see them both as information systems that achieve a result but there's a phase difference in where the information is stored.

Similar to how space is 4D such that with relativity going faster in a spatial dimension kind of "borrows" from the time dimension (in a hand wavy way).

Under the condition that you're adding integers below N, then if you consider BOTH the (hardcoded, algorithmic) as a singular space (as with 4D space time) then they are equivalent.

Need to work on this theory further to make it more understandable, but I think this way about intelligence.

Edit: Conversation with GPT-4 about this https://sharegpt.com/c/Sbs4XgI

continuational3y ago

Does that mean computers are not "really doing" addition?

simonh3y ago

iliane53y ago

What I find really entertaining is the "just predicting the next token" argument.

If just predicting the next token can produce similar or better results than the almighty human intelligence on some tasks, then maybe there's a bit of hubris in how smart we think we actually are.

tines3y ago

> If just predicting the next token can produce similar or better results than the almighty human intelligence on some tasks

> then maybe there's a bit of hubris in how smart we think we actually are

I feel like you could say this if ChatGPT or whatever obtained its knowledge some other way than direct guidance from humans, but since we hand-fed it the answers, it falls a little flat for me.

I'm open to persuasion.

autokad3y ago

1 more reply

iliane53y ago

> But it's not better than almighty human intelligence, it _is_ human intelligence, because it was trained on a mass of some of the best human intelligence in all recorded history

Sure, I was saying "better" in the sense that if for X task, it can do better than Y% of humans.

> since we hand-fed it the answers, it falls a little flat for me

1 more reply

simonh3y ago

I think it’s undeniable that LLMs encode knowledge, but the way they do so and what their answers imply, compared to what the same answer from a human would imply, are completely different.

iliane53y ago

1 more reply

int_19h3y ago

Thing is, they can still solve the problem, even if the problem was not one from its training set.

And, more importantly, they solve the problem much better if you tell them to reason about it in writing first before giving the final answer.

1 more reply

goldfeld3y ago

0: pasted from another thread

majormajor3y ago

We've known for a long time that computers can do calculations far, far, far faster than us.

We continue to figure out new ways to make those calculations do more complicated things faster than humans.

What is intelligence beyond calculation is an ancient question, but not the one I'm most interested in at the moment, re: today's tools.

iliane53y ago

1 more reply

opportune3y ago

1 more reply

feoren3y ago

pegasus3y ago

bnralt3y ago

I imagine many definitions are initially rather broad and only get refined down over time. Laertius gives us a classic example:

I don’t think it’s correct to think of that as infinitely moving goalposts, however. More that the weakness of definitions isn’t always immediately transparent.

eternalban3y ago

> I cannot see that it matters if a computer understands something.

tim3333y ago

dvt3y ago

> It seems to me to posit that to understand requires that the understandee is human.

LLMs are a cute parlor trick, but a phantasm nonetheless.

iliane53y ago

lordnacho3y ago

1 more reply

root_axis3y ago

> if it quacks like a duck and walks like a duck, and your only need is for it to quack and walk like a duck, then it doesn’t matter if it’s actually a duck or not for all intents and purposes.

This isn't true. If a plane flies like a bird and you only need it for flying it doesn't then follow that a plane is a bird "for all intents and purposes".

i-use-nixos-btw3y ago

The emphasis on “for all intents and purposes” is important. If you specify requirements you specify them for your intent and for your purpose.

barrysteve3y ago

I have tried multiple times to use ChatGPT to generate Unreal C++ code. It does not do.

For my use case, it has never quacked like a duck, so to speak. It never performed, the word that might cover the concept of generating output without understanding it.

We agree on the value of computers understanding versus performing... only as much as you need understanding to make it perform.

Predicting words alone does not cut the mustard, some structural depth or validating maps or some new concept is needed to shore up the wild horsepower in ChatGPT.

It must understand/have structure, or at least use a crutch to get it over the finish line..

i-use-nixos-btw3y ago

Yes, in this instance I understand failings of today (though copilot has a much better hit rate, and at the moment it’s a great augmentation to coding if you treat it like an enthusiastic intern).

missingrib3y ago

> My reply is that, if it acts as if it understands Chinese in every situation, then there’s no measurable way of distinguishing it from understanding.

2 more replies

barrysteve3y ago

It's hard to keep this theoretical. Yes a machine is just a machine.

Defining a machine to be conscious, allows the individual to soak their mind in code and silicon as a receptacle for their spirit.

It creates a pull into a 'second mind'. Anybody who believes this is likely to invest heavily in the maintenance of new technology.

I can take the stochastic parrot metaphor, to an article about LLMs and understand in a couple of days what took years of research to create.

cgh3y ago

Yeah, I asked it to write some relatively simple Zig and it produced a weird mix of Zig and C. It just made stuff up, fake it until you make it style.

lsy3y ago

d0mine3y ago

Here's GPT output that demonstrates that it doesn't reflect on its text: «GPT Chat: You are right, I apologize for the confusion in my previous responses. The corrected explanation is:

The pattern is counting the number of closed spaces in each letter of the spelled-out number. A closed space is any enclosed space in a letter, such as in the letters "a", "b", "d", "e", etc.

Following the pattern:

Each item follows the pattern, as the number of closed spaces in their letters matches the corresponding number in the pattern.

The whole sequence is:

one -> 2 two -> 1 three -> 2 four -> 1 five -> 1 six -> 0 seven -> 2 eight -> 1 nine -> 1 ten -> 1 eleven -> 3 twelve -> 2 thirteen -> 2 ...»

It is clear the model doesn't know what it is talking about.

int_19h3y ago

Here's a different example involving dataset analysis with GPT-4 that required it to analyze its own previous outputs to find and correct mistakes and form a new hypothesis:

https://gist.github.com/int19h/cd1d1598f91e8ba92dd8e80bd5d21...

rwmj3y ago

Isn't the whole point of Searle raising the Chinese Room argument exactly so that these kinds of things can be debated?

jacksnipe3y ago

This is also my interpretation of the Chinese Room.

Norvig and Chomsky really got into this type of argument, though maybe it’s a stretch to say it’s this exact one; see Norvig’s side here: https://norvig.com/chomsky.html

JieJie3y ago

I literally lost a friend of thirty years yesterday because she is wedded to the Chinese Room analogy so fiercely, she refuses to engage on the subject at all.

For all the terrible things people worry about ChatGPT doing, this was not one that I thought I was going to have to deal with.

(edit: ChatGPT was not involved at all, but when I suggested she give it a try to see for herself, that was the end of it.)

brazzy3y ago

How can a disagreement over GPT possibly be something to end a friendship over?

This sounds like you said "I cannot possibly be friends with someone who does not believe that LLMs are emerging AGI!", and people read it like that and are downvoting you.

I'm gonna assume the situation was more complex, but still find it hard to imagine, how a disagreement over such an academic topic could end up destroying a friendship.

JieJie3y ago

It was a complex situation to be sure.

If there's a lesson to be learned it's that people's tempers over these issues may be hotter than they appear.

I can barely speak with my artist friends on the issue these days due to their generative AI fears. Their emotions are completely intractable on the subject: AI art is theft. Period.

2 more replies

bulkprotocol3y ago

You blew up a 30 year friendship over an...analogy?

JieJie3y ago

I didn't! Someone else did it to me. I was trying desperately not to.

(edit: This is the kind of stuff I think my friends are watching and being informed by [0] as it was what they are posting in our common areas.)

[0]: https://youtu.be/ro130m-f_yk

1 more reply

user_named3y ago

You're talking about intelligence - GPT doesn't quack at all. It clearly is not a duck.

onetrickwolf3y ago· 43 in thread

simonw3y ago

I've started to suspect that generating code is actually one of the easier things for a predictive text completion model to achieve.

Programming languages are a whole lot more structured and predictable than human language.

In JavaScript the only token that ever comes after "if " is "(" for example.

exitb3y ago

On the other hand, if you want to use an external library on line 80, you need to import it at the top.

matjet3y ago

twobitshifter3y ago

pen2l3y ago

Separately (and apologies for going on a tangent), where do you think we are in the Gartner cycle?

justrealist3y ago

> hallucinates up a method call that doesn't exist

I actually think it helps to reframe this. It hallucinates up a method call that predictively should exist.

2 more replies

danenania3y ago

capableweb3y ago

> In JavaScript the only token that ever comes after "if " is "(" for example.

I'm pretty sure " " (whitespace) is a token as well, which could also come after an `if`. I think overall your point is a pretty good one though.

fl73053y ago

> I've started to suspect that generating code is actually one of the easier things for a predictive text completion model to achieve.

> Programming languages are a whole lot more structured and predictable than human language.

> In JavaScript the only token that ever comes after "if " is "(" for example.

But isn't that like saying that it's easy to generate English text, all you need is a dictionary table where you randomly pick words?

(BTW, keep up the blog posts, I really enjoy them!)

int_19h3y ago

One thing to bear in mind is that GPT training set for code is supposedly skewed very heavily towards Python.

camillomiller3y ago

This!

EForEndeavour3y ago

- What is thinking, exactly?

- Does human (or superhuman) thinking require consciousness?

(And yes, it's occurred to me that I could try asking GPT4 to help me make these questions more complete)

Workaccount23y ago

We don't know what consciousness is, and we don't know what it means to "think". There, I saved you a decade of reading.

Edit: My choice theory is panpsychism, https://plato.stanford.edu/entries/panpsychism/ but again, we don't yet know how to verify any of this (or any other theory).

ryanwaggoner3y ago

4 more replies

nicpottier3y ago

The hypothesis that I find most compelling and intuitive is that language is thought and vice versa. We made a thing really good at language and it turns out that's also pretty good at thought.

One possible conclusion might be that the only thing keeping GPT algos from going full AGI is a loop and small context windows.

robwwilliams3y ago

Add the strange loops and embed in a body that interacts with a real or rich virtual world—that should do the trick. Of course there should ideally be an emotional-motivational context.

narag3y ago

- Does human (or superhuman) thinking require consciousness?

I was going to write this exactly. I believe these things think. They're just not alive.

- What even is consciousness?

BS concepts like qualia are all the rage now, but ultimately useless.

HarHarVeryFunny3y ago

My views:

The best definition of "intelligence" is "the degree of ability to correctly predict future outcomes based on past experience".

kingkongjaffa3y ago

I think since the mechanisms are different we should arrive at a distinction between:

organic thinking (I.e. the process our squishy human brains do)

and mechanical thinking ( the computational and stochastic processes that computers do ).

TuringTest3y ago

I don't think the substrate defines the nature of the thinking, but the form of the process does.

coding1233y ago

I doubt we'll ever be able to answer this, even after we create AGI.

HarHarVeryFunny3y ago

There are two ways of looking at this.

If you want to correctly predict what comes next after "with a board position of ..., Magnus Carlsen might play", then you better have learned a whole lot about the meaning of the input!

ben_w3y ago

To be deliberately unfair, imagine a huge if-else block — like, a few billion entries big — and each branch played out a carefully chosen and well-written string of text.

It would convince a lot of people with the breadth, despite not really having much depth.

The real GPT model is much deeper than that, of course, but my toy example should at least give a vibe for why even a simple thing might still feel extraordinary.

myrmidon3y ago

This is absolutely not viable because exponential growth absolutely kills the concept.

ben_w3y ago

α: most of those sentences are meaningless so they won't come up in normal use

β: if statements can grab patterns just fine in most languages, they're not limited to pure equality

γ: it's a thought experiment about how easy it can be to create illusions without real depth, and specifically not about making an AGI that stands up to scrutiny
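
The thought experiment, including point β about patterns, can be sketched as a toy in Python. The patterns and canned replies below are invented for illustration; the real thing would need billions of branches, and none of this resembles how GPT actually works.

```python
import re

# Hypothetical canned branches (pattern, reply), invented for illustration only.
BRANCHES = [
    (re.compile(r"\b(hello|hi|hey)\b", re.I), "Hello! What would you like to talk about?"),
    (re.compile(r"\bwhat is (?P<topic>[\w ]+)\?", re.I), "People have argued about {topic} for centuries."),
    (re.compile(r"\bbye\b", re.I), "Goodbye!"),
]

def canned_reply(message: str) -> str:
    """Return the reply of the first branch whose pattern matches."""
    for pattern, reply in BRANCHES:
        match = pattern.search(message)
        if match:
            # A branch can capture part of the input instead of testing pure equality.
            return reply.format(**match.groupdict())
    # Fallback branch when nothing matches.
    return "Interesting. Tell me more."

print(canned_reply("Hi there"))
print(canned_reply("What is thinking?"))
print(canned_reply("Explain quantum gravity"))
```

Every reply is well-written, so the breadth looks impressive, yet any input outside the table falls straight through to the fallback.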

cgearhart3y ago

It’s a fallacy to describe what the machine does as “thinking” because that’s the only process you know for achieving the same outcome.

int_19h3y ago

robotresearcher3y ago

Perhaps it’s more productive to go the other direction and consider how the concept of ‘thinking’ could be reconsidered.

It’s not like we all agree on what thinking is. We never have. It may not even be one thing.

LeSaucy3y ago

I have only seen gpt generate imperative algorithms. Does it have the ability to work with concurrency and asynchrony?

z3c03y ago

k__3y ago

I tried point free solutions, which threw it off.

lm284693y ago

Care to post a full example ?

simonw3y ago

lm284693y ago

It's very nice, it's very impressive, it will help people, but it doesn't align with the "you're just about to lose your job" "Skynet comes in the next 6 months" &c.

If these basic samples are a bottleneck in your day to day life as a developer I'm worried about the state of the industry

camillomiller3y ago

Love how you didn’t care about styling this like at all, Lol. Btw, if you ask gpt to make it presentable by using bootstrap 5 for example it can style it for you

ZeroGravitas3y ago

I'm assuming the bits that say

> // Rest of the code remains the same

Are exactly as generated by GPT-4, i.e. it knew it didn't need to repeat the bits that hadn't changed, and knew to leave a comment like this to indicate that to the user.

It gets confusing when something can fake a human so well.

samstave3y ago

What is the time delta between fixing GPT code and writing it all yourself? Is it a reasonable scaffold that will grow over time?

agentultra3y ago

It's not thinking, plain and simple.

myrmidon3y ago

What gives you any confidence that the way GPT4 comes up with answers is qualitatively different from humans?

Why should the emulation of human thought, a result of unguided evolution, require anything more than properly wired silicon?

agentultra3y ago

That's highly reductive of our capacities. We are not weighted transformers that can be explained in an arxiv paper. GPT, at the end of the day, is a statistical inference model. That's it.

It's not going to wake up one day, decide it prefers eggs benny and has had enough of your idle chatter because of that sarcastic remark you made last week.

Could we simulate a plausibly realistic human brain on silicon someday? I don't know, maybe? But that's not what GPT is and we're nowhere near being able to do that.

TuringTest3y ago

> What gives you any confidence that the way GPT4 comes up with answers is qualitatively different from humans?

Biological brains OTOH have a rhythm with feedback mechanisms which adapt to the situation where they're doing the thinking.

fnordpiglet3y ago

m3kw93y ago

What’s novel to you could be just trained material

seydor3y ago· 20 in thread

nitnelaveOP3y ago

seydor3y ago

I think you're right that it's not intuitive, it's like basic arithmetic is laughing at us

EMM_3863y ago

> I struggle to understand why this thing works the way it does.

I'm not in this field but have recently found myself going on as deep a dive into it as my small brain can absorb.

I now know about (on a surface level) neural networks, transformers, attention mechanisms, vectors, matrices, tokenization, loss functions and all sorts of other crazy stuff.

And we're just getting started.

fastball3y ago

photochemsyn3y ago

No worries, readers can just ask ChatGPT 'what's the meaning of attention in the context of LLM design' and go from there [edit also temperature].

masswerk3y ago

> that there are no demons underneath

(I actually do think that revisiting Lacan in this context may be productive.)

thanatropism3y ago

We need to form some sort of guild of engineers who think Deleuze, Latour, Lacan et caterva should be read within our disciplines.

Der_Einzige3y ago

They would be called "unemployable" because they are literal charlatans.

They literally describe their thought as being "Schizoanalysis". How many more red flags do you need?

Also, the more "modern" takes on this from techno folks, such as from Nick Land (Fanged Noumena), are openly fascist - https://en.wikipedia.org/wiki/Dark_Enlightenment

If you want cultural critique from smart people without it turning into fashionable nonsense, I recommend Mark Fisher, but be warned, he too killed himself.

masswerk3y ago

I second this. The STEM (= engineering) / humanities divide in software may miss some.

robwwilliams3y ago

Please no! Read systems neuroscience. Like Hassabis does. Or if of a philosophical persuasion, then Dennett or Rorty.

ttpphd3y ago

In this moment it would be better if people would read some Wittgenstein and Roy Harris, IMHO.

robwwilliams3y ago

gcr3y ago

Whether there are demons underneath depends on whether you believe a lossy statistical snapshot across the Jungian subconscious of all human writing contains demons.

It's a bit more plausible when we phrase it that way...

dado32123y ago

Is there an article that would meet your standard (discusses transformers, uses the word 'attention') that you could give to someone who has no familiarity with computer science?

seydor3y ago

https://jalammar.github.io/illustrated-transformer/

int_19h3y ago

Analog243y ago

Embeddings and their relationship to each other are definitely relevant to transformers. Why do you think that's not the case?

seydor3y ago

gptX embeddings aren't even words. Even so, the embedding relationship is useful but not the core of what transformers do to find relationships between words in sequences.

gcr3y ago

Doing this, we gave the space meaning by the fact that cosine similarity is a meaningful proxy for semantic similarity. Individual dimensions aren't meaningful, but distance in this space is.

A stronger article would attempt to replicate the word2vec analogy experiments (imo one of the more fascinating parts of that paper) with GPT's embeddings. I'd love to see if that property holds.
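
For readers who haven't seen the analogy trick, here is a minimal sketch in Python. The 3-dimensional vectors are hand-made for illustration and are an assumption on my part; real word2vec or GPT embeddings are learned and have hundreds or thousands of dimensions, so this only shows the arithmetic, not whether the property holds for GPT.

```python
import math

# Hypothetical toy "embeddings", hand-made for illustration only.
VECTORS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.1, 0.5, 0.5],
}

def cosine(u, v):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(a, b, c):
    """Answer 'a is to b as c is to ?' via the nearest neighbour of b - a + c."""
    target = [vb - va + vc for va, vb, vc in zip(VECTORS[a], VECTORS[b], VECTORS[c])]
    # Exclude the three input words, as the word2vec experiments do.
    candidates = [w for w in VECTORS if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(VECTORS[w], target))

print(analogy("man", "king", "woman"))  # queen
```

Note that no single coordinate is inspected anywhere; only the angle between whole vectors matters, which matches the point about individual dimensions not being meaningful.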

Analog243y ago

ZeroGravitas3y ago· 11 in thread

> It is able to link ideas logically, defend them, adapt to the context, roleplay, and (especially the latest GPT-4) avoid contradicting itself.

Isn't this just responding to the context provided?

Like if I say "Write a Limerick about cats eating rats" isn't it just generating words that will come after that context, and correctly guessing that they'll rhyme in a certain way?

It's really cool that it can generate coherent responses, but it feels icky when people start interrogating it about things it got wrong. Aren't you just providing more context tokens for it?

danenania3y ago

It's a bit like the Sagan quote: "If you wish to make an apple pie from scratch, you must first invent the universe".

tines3y ago

> Sometimes for GPT to "just" complete the next word in a way that humans find plausible, it must, along the way, develop a model of the world, theory of mind, abstract reasoning. etc.

Am I wrong? I'm open to persuasion.

squeaky-clean3y ago

In ChatGPT the language-model and world-model are really just the same model, which makes a lot of sense.

ttpphd3y ago

danenania3y ago

The evidence is ChatGPT's output. Unless you're saying that passing the bar exam, writing working code, etc. doesn't require abstract reasoning abilities or a model of the world?

IshKebab3y ago

> Like if I say "Write a Limerick about cats eating rats" isn't it just generating words that will come after that context, and correctly guessing that they'll rhyme in a certain way?

Aren't you just doing that?

int_19h3y ago

"Responding to the context provided" is very vague. I could argue that I'm doing exactly that right now as I'm writing this comment. It does not imply not being able to e.g. link ideas logically.

samstave3y ago

There once was a Cat in New York

Who got caught for feeding some Rats ; Tremendous Work!

All the people tell me, many men, biggly men - many with tears in their eyes...

That I have done nothing legally-wise

But the truth is ; I am an enormous dork.

>>_Created by an actual Human Being with actual DNA for crime scene evidence._

But just when they tried to brush under a rug

To try to make the folks 'shrug'

Is the Streisand Effect as a scar

As everyone knows of payments to a Porn Star

And the nation will know you're a simple thug.

samstave3y ago

There once was a man in New York

Guilty of paying too much for pork

He thought he would never stand

on a trial from the local grand

but corruption was just part of the work.

SilasX3y ago

>Like if I say "Write a Limerick about cats eating rats" isn't it just generating words that will come after that context, and correctly guessing that they'll rhyme in a certain way?

It seems GPT-based software is likewise solving other major problems, with LLMs just one piece, just like logic gates are just one piece of what a computer is doing.

[1] https://en.wikipedia.org/wiki/Arithmetic_logic_unit?useskin=...

[2] For example, the chaining together of calls to external services for specific types of information.

pulvinar3y ago

agentultra3y ago· 11 in thread

A good article and well articulated!

I would change the introduction to be more impartial and not anthropomorphize GPT. It is not smart and it is not skilled in any tasks other than that for which it is designed.

I really would like to see more technical writing of this sort geared towards a general audience without the speculation and science-fiction pontificating.

Good effort!

fl73053y ago

> it is not skilled in any tasks other than that for which it is designed.

But it wasn't designed. It's not a computer program, where one can make confident predictions about its limitations based on the source code.

It's a very large black box. It was trained on guessing the next word. Does that fact alone prove that it cannot have evolved certain internal structures during the training?

Do you claim that an artificial neural network with trillions of neurons can never be intelligent, no matter the structure?

Or is the claim that this particular neural network with trillions of neurons is not intelligent? If so, what is the reasoning?

> It is not smart

"Not smart" = "not able to reason intelligently".

Is that a falsifiable claim?

What would the empirical test look like that would show us if the claim is correct or not?

Look, I realize that "GPT-4 is intelligent" is an extraordinary claim that requires extraordinary evidence.

But I think we're starting to see such extraordinary evidence, illustrated by the examples below.

https://openai.com/research/gpt-4 (For instance, the "Visual inputs" section)

Microsoft AI research: Many convincing examples, summarized with:

"The central claim of our work is that GPT-4 attains a form of general intelligence, indeed showing sparks of artificial general intelligence."

https://arxiv.org/abs/2303.12712

agentultra3y ago

> It's a very large black box. It was trained on guessing the next word. Does that fact alone prove that it cannot have evolved certain internal structures during the training?

As far as I’m aware no man-made machine has ever exhibited the ability to evolve.

> Do you claim that an artificial neural network with trillions of neurons can never be intelligent, no matter the structure?

If, by structure, you mean some algorithm and memory layout in a modern computer I think this sounds like a reasonable claim.

> Look, I realize that "GPT-4 is intelligent" is an extraordinary claim that requires extraordinary evidence.

That’s the crux of it.

fl73053y ago

> There’s no need to ascribe any belief that they can evolve, modify themselves, or spontaneously develop intelligence.

But neural networks clearly evolve and are modified during training. Otherwise they would never get any better than a random collection of weights and biases, right?

Is the claim then that an artificial neural network can never be trained in such a way that it will exhibit intelligent behavior?

>> Do you claim that an artificial neural network with trillions of neurons can never be intelligent, no matter the structure?

> If, by structure, you mean some algorithm and memory layout in a modern computer I think this sounds like a reasonable claim.

Yes, that's what I mean.

Is your claim that no Turing machine can be intelligent?

>> Look, I realize that "GPT-4 is intelligent" is an extraordinary claim that requires extraordinary evidence.

> That’s the crux of it.

And I provided links to such evidence. Is there a rebuttal?

If we're saying that GPT-4 is not intelligent, there must be questions that intelligent humans can answer that GPT-4 can't, right?

What is the type of logical problem one can give GPT-4 that it cannot solve, but most humans will?

chlorion3y ago

>But it wasn't designed. It's not a computer program, where one can make confident predictions about its limitations based on the source code.

It definitely is exactly that. It's not any more special than any other program that you can write. I am not totally sure that what you describe could ever exist at all.

https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-...

fl73053y ago

Another example of "evolved behavior" is here, where a robot is trained to walk, run, etc:

https://mrl.snu.ac.kr/research/ProjectAgile/Agile.html

This is done using neural networks. I believe a project like that can be done by a few researchers over months, not years?

If you do this using "regular programming" instead, you'd have to write an insanely complex application that uses inverse kinematics etc.

https://en.wikipedia.org/wiki/Inverse_kinematics

A project like that requires a large team of developers, working over many years. Boston Dynamics is one example.

MacsHeadroom3y ago

nitnelaveOP3y ago

I'm planning on continuing this vulgarization series of "Simply explained", for instance to cover how computers communicate, keep an eye out for them!

agentultra3y ago

> keep an eye out for them!

I will, thank you! :)

> Regarding the speculation/destroy society, I was directly answering questions that I got from laypeople around me.

Best of luck and keep up the good work!

ryanwaggoner3y ago

> It is not smart and it is not skilled in any tasks other than that for which it is designed.

prerok3y ago

I agree that it is a good effort. I don't agree with the criticism.

So, I would say it is great to share it with them and I think they are the intended audience.

unethical_ban3y ago

winternett3y ago· 10 in thread

Where is IBM's Watson in all this? It seems as if it never existed? That is just one example of how companies keep making these grand presentations and under-delivering on results...

We are far from autonomous and self reliant tech, even power grids across most of the planet aren't reliable enough to support tech being everywhere and replacing jobs.

Let's stop making the wrong people rich and popular.

Analog243y ago

What would be the differentiating factor(s) for true AI/intelligence in your opinion?

winternett3y ago

senectus13y ago

any sufficiently advanced technology is AI...

winternett3y ago

Perhaps if you're a marketer anything can be defined loosely.

rimliu3y ago

Having a world model

Analog243y ago

rootusrootus3y ago

Intent.

nomel3y ago

I think three pieces are missing for intelligence. In the order that they'll probably be implemented:

attention, intent, free running continuous input/feedback (aka, consciousness).

Analog243y ago

What is your intent?

wolpoli3y ago

> Where is IBM's Watson in all this? It seems as if it never existed?

Nowadays, IBM's Watson is simply a brand name for any AI/ML related products under IBM.

habosa3y ago· 9 in thread

vadansky3y ago

I hate being the bearish guy during the hype cycle, but I think a lot of that is just anthropomorphizing it. They fed it TBs of human text, it spits out human text, we think it's humanesque.

Of course maybe I'm wrong and it's AGI and it will find this comment and torture me for insulting its intelligence.

rootusrootus3y ago

> I hate being the bearish guy

It's a pretty handy technology, to be sure. But it's still just a tool.

anotherman5543y ago

>"I don't understand it, but it says smart-sounding things, and I don't understand the human brain either, so they're probably the same, it must be sentient!"

This perfectly summarizes so much of the discourse around GPT.

Except people lack the humility to say they don't understand the brain, so instead they type "It works just like your brain," or "Food for thought: can you prove it isn't just like your brain?"

danaris3y ago

Yeah; there's way too much "humanity of the gaps" here recently.

rimliu3y ago

I really liked the analogy to a student given in this wonderful article: https://www.newyorker.com/tech/annals-of-technology/chatgpt-...

int_19h3y ago

habosa3y ago

robwwilliams3y ago

LoL: how many of us are trying to kiss up to our future AGI overlords or children?

I think almost all in HN space would confidently assert that there is no AGI lurking in GPT4+. But add the right higher order modules and self-controlled recursion and Bingo.

olddustytrail3y ago

It's more like you feed a million cows into a meat grinder, then into a sausage machine, and then weirdly what appears to be a mooing cow comes out the other end.

It's weird it works when you know how it works.

alkonaut3y ago· 9 in thread

What I wonder most is how it encodes knowledge/state other than in the sequence of queries/responses. Does it not have a "mind"?

tel3y ago

Yes, you absolutely can. I just did it and despite an initial warning that it couldn't play the game (always fun to see these) it absolutely did.

(Edit, also worth noting that some evidence suggests GPT, including 4, is pretty bad at randomly drawing numbers.)

nicpottier3y ago

Try making bad guesses and see it fall apart. (IE, guess something higher than your last guess when it says "too high") It is fully dependent on your part in the charade to keep the act up.

tel3y ago

Except, GPT is smarter than that. Even an inconsistent prompt is still more likely to have some kind of nonsense in the same vein as the asking.

mithr3y ago

ryanwaggoner3y ago

It's more complicated than that, because it can competently answer many questions that it has a 0% chance of having seen before.

nicpottier3y ago

alkonaut3y ago

Shouldn’t it be a reasonable (and pretty simple) addition to just have a secret scratchpad - an inner monologue - where the bot is free to add context which is not “published”?

tel3y ago

I sure don't see why not. You could probably even write a prompt like this.

GPT-4: I have selected a number between 0 and 100. Here is the number in a different language: सत्ताईस. Now, please start making your guesses, and I will guide you accordingly.

Then, ultimately,

Me: please translate your original commitment to this number and prove to me I guessed correctly

GPT-4: My original commitment to the number was written in Hindi: सत्ताईस. When translated to English, it reads "thirty-six." You correctly guessed the number I chose. Congratulations!

Except, amusingly, Google Translate tells me सत्ताईस translates to 27.
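
A commitment that can actually be checked is usually done with a hash rather than a translation. This is a minimal sketch of a standard commit-reveal scheme in Python; it assumes a wrapper program around the model holds the secret, which is my assumption and not something GPT does by itself. The function names are made up for the example.

```python
import hashlib
import secrets

def commit(number: int) -> tuple[str, str]:
    """Return (commitment, salt). Only the commitment is shown to the guesser."""
    # The salt matters: with only 101 possible numbers, an unsalted hash
    # could be brute-forced by hashing every candidate.
    salt = secrets.token_hex(16)
    digest = hashlib.sha256(f"{salt}:{number}".encode()).hexdigest()
    return digest, salt

def verify(commitment: str, salt: str, claimed_number: int) -> bool:
    """Check, after the game, that the revealed number matches the commitment."""
    digest = hashlib.sha256(f"{salt}:{claimed_number}".encode()).hexdigest()
    return secrets.compare_digest(digest, commitment)

commitment, salt = commit(27)
print(verify(commitment, salt, 27))  # True
print(verify(commitment, salt, 36))  # False
```

With this, a claim of "thirty-six" at the end would fail verification instead of being taken on trust.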

Gunnerhead3y ago

You definitely can play that, I just did with GPT 4 as I got curious from your comment.

GPT: All right, I have a number between 1 and 10 in my mind. Go ahead and guess!

Me: 2

GPT: Too low! Try again.

Me: 9

GPT: Too high! Try again.

Me: 5

GPT: You got it! The number I had in mind was 5. Well done!
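
A transcript like this can't show that a number was fixed in advance, only that the answers never contradicted each other. A small Python sketch (names are mine, for illustration) that lists every secret number still compatible with the feedback so far:

```python
def consistent_secrets(low: int, high: int, feedback: list[tuple[int, str]]) -> list[int]:
    """Return every number in [low, high] that agrees with all (guess, answer) pairs."""
    def agrees(secret: int, guess: int, answer: str) -> bool:
        if answer == "too low":
            return guess < secret
        if answer == "too high":
            return guess > secret
        if answer == "correct":
            return guess == secret
        raise ValueError(f"unknown answer: {answer}")

    return [s for s in range(low, high + 1)
            if all(agrees(s, guess, answer) for guess, answer in feedback)]

# The game above, before and after the final guess.
print(consistent_secrets(1, 10, [(2, "too low"), (9, "too high")]))                  # [3, 4, 5, 6, 7, 8]
print(consistent_secrets(1, 10, [(2, "too low"), (9, "too high"), (5, "correct")]))  # [5]
# Contradictory feedback leaves no possible number.
print(consistent_secrets(1, 10, [(5, "too high"), (7, "too low")]))                  # []
```

After the first two answers six numbers were still possible, so "5" being correct is consistent with the model having chosen it at that moment.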

Zetice3y ago· 7 in thread

Does anyone have a good recommendation for a book that would cover the underlying ideas behind LLMs? Google ends up giving me a lot of ads, and ChatGPT is vague about specifics as per usual.

olooney3y ago

[1]: https://www.amazon.com/Deep-Learning-Python-Francois-Chollet...

[2]: https://en.wikipedia.org/wiki/Transformer_(machine_learning_...

danenania3y ago

Not a book, but here's a really good explanation in blog post form from Stephen Wolfram:

https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-...

Zetice3y ago

I do not trust that man one iota.

seizethecheese3y ago

The blog post is very good.

cjblack3y ago

Why?

ww5203y ago

To understand LLM from ground up, the following topics would help.

  - Machine Learning basics. e.g. weight parameters being trained.
  - Neural Net basics.
  - Natural Language Processing basics.
  - Word vectorization, word embedding. e.g. Word2Vec.
  - Recurrent Neural Net basics.
  - LSTM model.
  - Attention and Transformer model.
  - Generative model like GAN.
  - Generative Pre-trained Transformer.

I might have missed a few topics. Actually ask ChatGPT to explain each topic. See how far it goes.

giardini3y ago

Or you could save some time by clicking on:

"What Is ChatGPT Doing … and Why Does It Work?"

https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-...

graciously provided above in this discussion by danenania.

As seizethecheese asserts, also above, "The blog post is very good."

zackmorris3y ago· 5 in thread

On the other hand, many people who are not ready to change, who do not have the skills or who cannot afford to reeducate are threatened.

ulnarkressty3y ago

Random human chiming in. This sounds like a lot of anxiety, something between FUD and FOMO. FUDMO?

Seriously tho, taking some time to get away from it would be good. Ignorance is bliss, this too shall pass etc.

(btw nice piece of writing, you should do it more often!)

tim3333y ago

It was the best of times, it was the worst of times...

In the long run tech does a bit too well with "food in their belly" to the point that obesity is the main problem in the English speaking world.

As to programming it's quite cool getting ChatGPT to write code and stuff. If you can't beat it make use of it I guess.

IKLOL3y ago

Method-X3y ago

throwbadubadu3y ago

I think this is a bit hard .. and also unfair to repeat that embrace-change-mantra, because what he says is as absurd as at the same time totally true (:

I'd hope some of us would just be there in 60 years to just tell the future: "Heee just embrace it, ya know" .. nuff said.

stareatgoats3y ago· 4 in thread

This article seems credible and actually made me feel as if I understood it, i.e. at some depth but not deeper than a relative layperson can grasp.

lm284693y ago

> What I can't understand is how the Bing chatbot can give me accurate links to sources but chatGPT4 on request gives me nonsensical URLs in 4 cases out of 5

The Bing version might run a Bing query, fetch the top X pages, run GPT on them, return a response based on what it read, and behind the scenes attribute the summary to its sources
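
Roughly, as a Python sketch. Everything here is a hypothetical stand-in: the tiny in-memory index replaces the search engine, and joining the page text replaces the GPT call. I have no knowledge of how Bing actually does it; the point is only that the URLs come from retrieval rather than from the model's own output.

```python
from dataclasses import dataclass

@dataclass
class Page:
    url: str
    text: str

# Stand-in corpus; a real system would query a search engine instead.
FAKE_INDEX = [
    Page("https://example.com/transformers", "Transformers use attention to relate tokens."),
    Page("https://example.com/markov", "Markov chains pick the next word from the previous one."),
    Page("https://example.com/cooking", "Apple pie needs apples, flour and butter."),
]

def search(query: str, top_k: int = 2) -> list[Page]:
    """Hypothetical search: rank pages by how many query words they contain."""
    words = set(query.lower().split())

    def overlap(page: Page) -> int:
        return len(words & set(page.text.lower().replace(".", "").split()))

    return sorted(FAKE_INDEX, key=overlap, reverse=True)[:top_k]

def answer_with_sources(query: str) -> dict:
    pages = search(query)
    # Placeholder for the language-model call: here we just join the retrieved text.
    summary = " ".join(page.text for page in pages)
    # Sources are copied from the retrieved pages, never generated.
    return {"answer": summary, "sources": [page.url for page in pages]}

result = answer_with_sources("how do transformers use attention")
print(result["sources"])
```

Since the links are copied from pages that were actually fetched, they can't be nonsensical in the way free-form generated URLs can.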

stareatgoats3y ago

That might be the reason, probably. I mostly wanted to complain TBH. But I'm assuming it's one of those wrinkles that will get ironed out in subsequent versions.

rootusrootus3y ago

> It doesn't matter in the cases where I ask it to write a program: the verification is in the running of it.

stareatgoats3y ago

Absolutely! It is seldom correct right off the bat.

sirwhinesalot3y ago· 3 in thread

swframe23y ago

sirwhinesalot3y ago

You are absolutely right, that's the more in depth explanation as to why it's not just an overly complicated markov chain.

At the same time, "meaning" here is essentially "close together in a big hyperdimensional space". It's meaning in the same way youtube recommendations are conceptually related by probability.

And yet, the output is nothing short of incredible for something so blunt in how it functions, much like our brains I suppose.

robwwilliams3y ago

Yes: this comment is on the mark wrt “a component of AGI” just like Wernicke’s and Broca’s areas of neocortex are modules needed for human cognition.

pwdisswordfishc3y ago· 3 in thread

Not that much to explain, really. Just read chapter 5 of https://uefi.org/sites/default/files/resources/UEFI_Spec_2_8...

mrwnmonm3y ago

Wha?

GuuD3y ago

GUID Partition Table. The older of the GPT brothers, more of a blue-collar guy

mrwnmonm3y ago

Man, I got interested already, could you please stop the suspense and just say how are they related?

davesque3y ago· 2 in thread

I'd be interested in hearing people's takes on the simplest mathematical reason that transformers are better than/different from fully connected layers. My take is:

    Q = W_Q X
    K = W_K X
    A = Q^T K = (X^T W_Q^T) (W_K X) = X^T (...) X
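
A quick numeric check of the identity above, as a pure-Python sketch. The matrices are arbitrary small examples chosen for illustration, with columns of X as tokens to match the notation; softmax, scaling and the value projection of real attention are deliberately left out.

```python
def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

# Arbitrary example values: 2 features x 3 tokens.
X = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 3.0]]
W_Q = [[1.0, 0.0], [2.0, 1.0]]
W_K = [[0.0, 1.0], [1.0, 1.0]]

def scores(x):
    """A = Q^T K with Q = W_Q X and K = W_K X."""
    q = matmul(W_Q, x)
    k = matmul(W_K, x)
    return matmul(transpose(q), k)

def scores_bilinear(x):
    """The same scores written as X^T (W_Q^T W_K) X."""
    m = matmul(transpose(W_Q), W_K)
    return matmul(matmul(transpose(x), m), x)

print(scores(X) == scores_bilinear(X))  # True

# Doubling the input quadruples the scores, unlike a linear layer W X.
X2 = [[2 * v for v in row] for row in X]
print(scores(X2) == [[4 * v for v in row] for row in scores(X)])  # True
```

So the scores are second-order in the input: the input multiplies itself, which a single fully connected layer W X cannot do.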

oceansea3y ago

Do you see what I mean?

davesque3y ago

Sounds interesting, but I'm really asking more of a technical question here than a philosophical one. Your comment seems a bit more high level than what I'm going for.

ben77993y ago· 1 in thread

I asked it which was better, Lisp or Almonds.

It said that was an impossible comparison like Apples and Oranges.

Then I asked it which were more similar, Apples & Oranges or Lisp & Almonds.

It seems like it has far more trouble with weird questions like this that even a small child will instantly figure out than it does with anything that seems like a lookup of information.

losvedir3y ago

I hear people with these outrageous examples and I have to wonder how or what you're actually trying? I just asked GPT-4 (via chat.openai.com):

> Which is more similar, Apples & Oranges or Lisp & Almonds?

and it said:

And the question about which is better had a reasonable answer, too. This is GPT-3.5 but GPT-4 was roughly equivalent:

> It's not really possible to compare the relative merits of Lisp and almonds, as they are fundamentally different things.

jokoon3y ago· 1 in thread

I am not convinced that Chat GPT could "think" if it had as many neurons or parameters as a human brain, and got as much training.

I would still be interested to see what it could do, if it did, but I don't think it would really help science understand what intelligence really is.

Being able to grow a plant and understand some conditions that favor it is one thing, but it's poor science.

Maybe there will be some progress when scientists are able to properly simulate the brain of an ant or even a mouse, but science is not even there yet.

seydor3y ago

> I don't think it would really help science understand what intelligence really is

Neuroscience is nowhere near finding out the connectome of a whole human brain so why not, we should look into these models as hints about what our circuits do.

pillowtalks_ai3y ago· 1 in thread

It is still funny to me that so much emergent behavior comes from some simple token sampling task

poulsbohemian3y ago

Maybe this ChatGPT stuff is "smarter" than I've been giving it credit.

charles_f3y ago

I commend the author for one of the clearest explanations I've seen so far, written to explain rather than impress. Even an idiot like myself understood what is explained.

Two things that I felt were glossed over a bit too fast were the concept of embeddings and that equation and parameters thing. Consider elaborating a bit more or giving an example

ianpurton3y ago

If you prefer to see it in code there's a succinct gpt implementation here https://github.com/LaurentMazare/tch-rs/blob/main/examples/m...

danesparza3y ago

At least part of this article is contradicted by Chat GPT itself. From the article:

"...Ongoing learning: The brain keeps learning, including during a conversation, whereas GPT has finished its training long before the start of the conversation."

From ChatGPT 4.x:

pyinstallwoes3y ago

So it’s basically the alchemical geometry of gematria and Isopsephia? Kinda cool that they’re similar in method.

LispSporks223y ago

I think it's the "The Paperclip Maximizer" scenario, not "The Paperclip Optimizer"

oblio3y ago

https://old.reddit.com/r/ChatGPT/comments/10q0l92/chatgpt_ma...

tabtab3y ago

Would it be a stretch to call GPT "glorified Markov Chains"? (I used tweaked M.C. once to make a music composer bot. I actually got a few decent tunes out of it, kind of a Bach style.)
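
For comparison, this is what a plain first-order Markov chain looks like, as a minimal Python sketch with a made-up corpus. It picks the next word by looking at the previous word only, whereas GPT conditions on the whole context window, which is where the "glorified" part comes in.

```python
import random
from collections import defaultdict

def build_chain(text: str) -> dict[str, list[str]]:
    """Map each word to the list of words that followed it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return dict(chain)

def generate(chain: dict[str, list[str]], start: str, length: int, seed: int = 0) -> str:
    """Walk the chain, sampling each next word from the followers of the last one."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        options = chain.get(out[-1])
        if not options:
            break  # dead end: this word never had a successor
        out.append(rng.choice(options))
    return " ".join(out)

corpus = "the cat sat on the mat and the cat ate the rat"
chain = build_chain(corpus)
print(chain["the"])  # ['cat', 'mat', 'cat', 'rat']
print(generate(chain, "the", 8))
```

The table grows with the number of distinct contexts, so extending the context from one word to thousands is not feasible with a lookup like this.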

slawr18053y ago

This was a great read! Especially for a beginner like me.

rfmoz3y ago

I’ve been looking for an article like this, great job. Thanks
