Yann LeCun on GPT-3

(facebook.com)

277 points · ar7hur · 5y ago · 241 comments

185 comments · 36 top-level

leftyted5y ago· 27 in thread

Reading this is really interesting:

> GPT-3 doesn't have any knowledge of how the world actually works.

I think this is a philosophical question. There is a view that, basically, there is no such thing as knowledge, just language (or, at least, there is no distinction between knowledge and language). In this view, all there really is is language, which is mostly composed of metaphors and, ultimately, metaphors only refer to other metaphors, i.e. language is circular. In this view, not only is the ultimate, physical, concrete world beyond us but also we can't even talk about it. From this perspective, GPT-3 is not substantively different than what our minds are doing.

That view makes some strong claims (I don't find it convincing), but it's out there. A slightly different claim, though, is that "knowledge of how (we think) the world actually works" is encoded in language. To me, that seems trivially true. So, again, how you take this quote from LeCun depends on what you think knowledge is and your view of the relationship between knowledge and language.

babesh5y ago

Tell that to all the other animals on Earth. Do they not also have knowledge? Do you really think they encode their knowledge in language?

Do you really think that humans are so special as to encode all their knowledge in language? Watch a movie. Listen to a song. Examine a piece of art. Feel sculpture. Play a guitar. Dance.

There is a segment of the software community that is highly language centric/adept. But that community is often blind to other forms of understanding.

Just look at the language of Shakespeare. Much of the language is visual and experiential. How much would you actually understand without your senses and imagination? Your knowledge encompasses your being.

dragonwriter5y ago

> Tell that to all the other animals on Earth. Do they not also have knowledge? Do you really think they encode their knowledge in language?

The real mechanistic view drops language as a special case and just says there is no knowledge, only behavior.

forgotmypw175y ago

>Tell that to all the other animals on Earth. Do they not also have knowledge? Do you really think they encode their knowledge in language?

Well, actually, yes, they do. Many animals have elaborate languages encompassing many concepts. Crows can explain to one another what a human looks like, for example.

skybrian5y ago

If GPT-3 has a consistent position on anything, it's only because the corpus it was trained on was consistent about it. So, for example, it will reliably autocomplete Jabberwocky because there are a lot of copies of this poem in the corpus and they are all the same.

If there were two versions of this poem that started the same way, it would pick between the variations in the corpus randomly. In other cases it might choose based on the style of prose or other stuff like that.
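A toy sketch of that "pick between the variations" behavior, assuming a plain word-count table rather than anything resembling GPT-3's actual architecture (GPT-3 is a neural network over subword tokens, not a lookup table). The corpus, the second poem variant, and the function names are all made up for illustration:

```python
import random
from collections import Counter, defaultdict

def train(corpus, n=2):
    """Count which word follows each n-word context in the corpus."""
    table = defaultdict(Counter)
    for text in corpus:
        words = text.split()
        for i in range(len(words) - n):
            context = tuple(words[i:i + n])
            table[context][words[i + n]] += 1
    return table

def next_word(table, context, rng):
    """Sample a continuation in proportion to how often the corpus used it."""
    counts = table[tuple(context)]
    words = list(counts)
    weights = [counts[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]

corpus = [
    "twas brillig and the slithy toves",  # the usual text, present twice
    "twas brillig and the slithy toves",
    "twas brillig and the mimsy toves",   # hypothetical variant, present once
]
table = train(corpus)
rng = random.Random(0)

# Where the copies agree there is only one possible continuation...
agreed = next_word(table, ["twas", "brillig"], rng)

# ...and where they diverge the choice is a weighted coin flip.
samples = Counter(next_word(table, ["and", "the"], rng) for _ in range(1000))
```

Nothing in the table says which variant is "right"; the more frequent one simply gets sampled more often.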

GPT-3 can get some trivia right, but it's only because the editors of Wikipedia already came to consensus about it and Wikipedia was weighted more. It doesn't have a way of coming to a consistent conclusion on its own.

Without consistency, how can it be said to know or believe anything? You might as well ask what a library believes. Sure, the authors may have believed things, but it depends which book you happen to pick up.

leftyted5y ago

I agree with you in that I would make a strong distinction between what a model like GPT-3 does and whatever it is that humans do.

But I do think you're missing the point just a bit. When we speak and think, we use all kinds of metaphors that express judgements about the world, usually without realizing it. In other words, the way we use language encodes concepts in a deep way.

To borrow an example from George Lakoff, we, in English, use war-metaphors to talk about arguments. Of arguments and of wars you can say things like "he's marshalling his forces," "they're ceding their territory," or "she's girding her defenses". In fact, almost anything you can say about a war you can also say about an argument. In American politics, with regard to partisan squabbling and the filibuster, we talk about "the nuclear option". The fact that these metaphors make sense to us indicates a judgement, something like "arguments are like wars". That judgement shows up in billions of lines of English scraped from the internet and can be fed into a model, allowing GPT-3 to "make that connection" via purely statistical methods.

Yes, this is a bit like asking "what a library believes". But a lot of these metaphors show up in our languages and, in a way, they express judgements, which is something akin to a belief. Does that mean a library has beliefs? Is this all knowledge is? I wouldn't go that far. But the argument is an interesting one and worth raising.

skybrian5y ago

Well, it's certainly interesting that it can learn metaphors, and this can be useful for creative purposes, so it's fun to play with.

But a sophisticated understanding of metaphors could be used to tell the truth or to lie. In the case of GPT-3, it doesn't know the difference. Telling the truth and lying come out of the same autocompletion process.

If you consider the use of a metaphor to be showing judgement, it means that a particular metaphor seems to be appropriate to use in a particular context.

luc4sdreyer5y ago

At the risk of reigniting the perpetual war about how to characterize machine intelligence, and by extension how to characterize the risk they pose, Yann has been (and still is AFAIK) more in the "existential AI risk is a long-term problem" group. In a 2016 interview LeCun said [1]:

> We’re very far from having machines that can learn the most basic things about the world in the way humans and animals can do. Like, yes, in particular areas machines have superhuman performance, but in terms of general intelligence we’re not even close to a rat. This makes a lot of questions people are asking themselves premature. That’s not to say we shouldn’t think about them, but there’s no danger in the immediate or even medium term. There are real dangers in the department of AI, real risks, but they’re not Terminator scenarios.

That's pretty measured overall, but he doesn't know that there's no existential AI risk in the medium term. No one does, and that's the problem. Experts simply suspect that it's unlikely. He and Stuart Russell have debated similar topics [2].

To tie back to your point: I keep seeing LeCun brush over tricky questions like yours and the ones at [2] with an arrogant confidence. I wish that he would be more careful, and I hope that I have a skewed view of him.

[1] https://www.theverge.com/2017/10/26/16552056/a-intelligence-...

[2] https://www.lesswrong.com/posts/WxW6Gc6f2z3mzmqKs/debate-on-...

pas5y ago

He's not wrong, we're very far. And looking at past "progress" it seems that we'll get there very slowly. So it seems long-term.

Except people are bad at exponential processes. Yet when economics drives us we are suddenly good at making them happen. And this combo seems to be what makes these existential risks. (Like climate change, or other manifestations of the coordination problem.)

alexanderskates5y ago

I think an important distinction to make is your use of the word "language", and how we think of language as it concerns human minds, and as it concerns GPT-3.

In our heads, language is a combination of words and concepts, and knowledge can be encoded by making connections between concepts, not simply words. If there is no concept or idea backing up the words, it can hardly be called knowledge. Consider the case of the man who did not speak French, yet memorised a French dictionary, and subsequently went on to win a Scrabble competition. Just because he knows the words, would you say he knows the language?

A language model such as GPT-3 operates only on words, not concepts. It can make connections between words on the basis of statistical correlations, but has no capacity for encoding concepts, and therefore cannot "know" anything.

leftyted5y ago

> In our heads, language is a combination of words and concepts, and knowledge can be encoded by making connections between concepts, not simply words. If there is no concept or idea backing up the words, it can hardly be called knowledge.

Great point.

> A language model such as GPT-3 operates only on words, not concepts. It can make connections between words on the basis of statistical correlations, but has no capacity for encoding concepts, and therefore cannot "know" anything.

Are you sure? Aren't "concepts" encoded in how language is used, at least to some degree?

LeCun does say that models that explicitly attempt to represent knowledge perform better than GPT-3 at answering questions. I'm no expert but I believe him.

jsinai5y ago

>Aren’t “concepts” encoded in how language is used, at least to some degree?

Good point and I think this shows up to the extent different languages might affect how we express particular concepts.

However I think it is more accurate to say that language solidifies and gives form to how we express concepts and the “concepts” themselves are independent of languages. Only our “expression” of these “concepts” depends on language.

For anyone interested in art and art history, this distinction was the central focus of the French surrealist painter Rene Magritte.

powerapple5y ago

Language is how we store our knowledge, and language is a system of words. If a language model contains all the possible sentences you can say, it will complete any of your sentences; don't you think it knows what you know? The input is a sequence of characters, so you can say it may or may not operate on words. It can operate on subwords, words or phrases as it sees fit. I like to think of intelligence as clouds. If you dig deep down, they are just droplets, but there are so many of them that they can appear in many different shapes, and they look completely different. Maybe intelligence is the same.

erispoe5y ago

Animals that do not have a language they can describe the world in still have knowledge about the world.

leftyted5y ago

Personally I do not find the whole "language = knowledge" argument convincing. But if you're interested in reading writers who make that argument (and perhaps I'm vulgarizing the argument a bit), Nietzsche makes it in On Truth and Falsity in their Extra-Moral Sense and George Lakoff makes it in Metaphors We Live By.

blancNoir5y ago

I'd also suggest Wittgenstein's Tractatus Logico-Philosophicus, a seminal work of the logical positivist movement. Influenced by Frege's predicate calculus, the aim of the Tractatus was to determine an isomorphic relationship between language, thought, and external states of affairs. An axiomatic attempt to reveal a potentially ideal logical language, that is not interested in meaning per se, but merely an accurate reflection of the world. A closed system that essentially excludes non-falsifiable metaphysical questions. Famously concluding with the instruction: "Whereof one cannot speak, thereof one must be silent." Part of Wittgenstein's project, even in its early aggressively logical form, was philosophy as a therapeutic. That is, the metaphysical questions concerning god, being, essence, and forms that had inspired thousands of years' worth of fevered conversation could finally be quieted. That's not to say they couldn't be meditated on, but they were not in the domain of his logical language, and so silence. Again, I think early Wittgenstein sometimes gets misinterpreted, "...therefore one cannot speak" does not, to me, mean that it can't be considered or one must forgo spirituality, just that it couldn't be spoken of within the project of the Tractatus.

Logical empiricism was ultimately a dead end as the criteria for even verifying empirical truth have long been contentious philosophically, and it was further critiqued by contemporaries such as Quine, who attacked the premise of the analytic/synthetic distinction (think Hume's fork, which Kant tried to solve), and Popper, who cited the problem of induction to critique the fundamental premises of the positivists' verificationism.

Wittgenstein is an interesting case, as the Tractatus is considered an early work of his, profoundly influential to analytic philosophy at the time, yet his later work, Philosophical Investigations, is sometimes seen to retract the dogmatism found in the Tractatus. I tend to take the view that it's a continuation of his thought, rather than a retraction of his earlier work. Crudely, whereas his former thought represented a narrowly axiomatic definition of language and its truth value, PI investigates, among many other ideas, language as an activity, or game, that has meaning dependent on the context of its use, languages as families. Granted, Wittgenstein is a complex thinker and these are simply my interpretations.

It's also curious to note that as positivism was beginning to fall out of favor around the time of the second world war, a continental thinker such as Heidegger, whose thought luxuriated in the kind of metaphysical questions the positivists necessarily eschewed, rose to prominence and was infamously sanctioned by the NSDAP to philosophize about their presumed "destiny". Bit of a tangent, but I think the historical context is relevant, as often philosophical movements are birthed from pre- and post-war attitudes.

powerapple5y ago

Are animals as intelligent as humans? What language does for us is let us pass knowledge on across generations, so that knowledge can accumulate. Animals may be able to pass simple concepts on to the next generation, but they won't be able to accumulate knowledge without language and writing.

mannykannot5y ago

As a response to leftyted, erispoe's point neither depends on nor implies that animals are as intelligent as humans, it is simply bringing up a counter-example: what appears to be knowledge in animals lacking language.

Of course, you could always attempt to define knowledge such that it is purely verbal, or alternatively define whatever is going on in the brain of an animal to be language, but is either approach useful? In common usage, we recognise, as knowledge, various things that cannot be communicated by language, such as knowing how to ride a unicycle on a tightrope (I doubt you can learn it just from a book) and the infamous qualia which supposedly prove that the mind is dualistic. And what about the knowledge of how to use language? How does that get bootstrapped?

c06n5y ago

In important ways yes, and in important ways no. Yes: can deal with dynamic spatial environments, can act towards a goal (intentionality), some almost certainly can plan how to reach these goals. No: Use of symbols and recursive language.

sgt3pr5y ago

I find this concept of knowledge encoded by language very interesting. Is there any author you can point to that follows this idea?

disambiguation5y ago

I don't know if i'd go as far as to agree that "there is no knowledge, only language" .. but I 100% agree one of the key insights from GPT-3 -- why training on language is so effective in the first place -- is that language is tightly coupled to reality

fallous5y ago

I'm not sure that you can assert that language is tightly coupled to reality, unless you're using the term "reality" to mean something akin to "as one perceives the world" (regardless of whether that perception is correct or not).

Most expressions of language that survived from a few thousand years ago are centered around myths, and while those myths may have contained certain moral or ethical lessons (that were and are subject to interpretation) they certainly weren't tightly coupled to reality in an objective sense.

Training on expressions of language (I separate the concept of language itself from its expression in the form of writing, speaking, etc) certainly has use cases but can GPT-3 recognize a previously unknown analogy and correlate it with the proper piece of applicable "knowledge" it has? If not then it really has no understanding.

disambiguation5y ago

a few things:

1. i get where you're coming from

2. yes, language is bottlenecked by human perception, as are all things

3. even the notion of myth and fiction is encoded in language. language is self-descriptive and self-aware and you can separate sense from non-sense.

4. i'm not talking about knowledge or understanding, but of addressing the question of why training on language lets GPT-3 make human-like predictions as if it knows about reality? either it's a fluke, or it's because language as a whole is a model that approximates reality.

runwai5y ago

> metaphors only refer to other metaphors, i.e. language is circular.

Except we (humans) have real-life experience that gives meaning to those metaphors.

maxlamb5y ago

Humans are very good at visual memory, thanks to millions of years of evolutionary pressure to hunt game and gather for food. Hence why the best memory champions in the world mostly use the concept of "memory palaces." When you think of a car, do you see a car in your head or does a dictionary definition (i.e. purely language knowledge) of a car pop in your mind?

jart5y ago

It's not that philosophical. If you visualize the GPT-3 embeddings using an embeddings projector then you can see the knowledge with your own eyes. That bypasses the need for GPT-3 to use language to communicate its knowledge to you. Computers aren't a black box like we've traditionally thought of brains.

smusamashah5y ago

> I think this is a philosophical question.

If it's only philosophical, then me saying that the Hacker News website itself has 'knowledge' of everything we discuss is also philosophical. The same can be applied to plain paper books.

How about any web application? A for loop? anything which can generate something for you?

nurettin5y ago

Can language as knowledge be falsified? Or is every statement equally true?

czzr5y ago· 27 in thread

For anyone else who doesn’t want to deal with Facebook, here’s the post:

Some people have completely unrealistic expectations about what large-scale language models such as GPT-3 can do.

This simple explanatory study by my friends at Nabla debunks some of those expectations for people who think massive language models can be used in healthcare.

GPT-3 is a language model, which means that you feed it a text and ask it to predict the continuation of the text, one word at a time. GPT-3 doesn't have any knowledge of how the world actually works. It only appears to have some level of background knowledge, to the extent that this knowledge is present in the statistics of text. But this knowledge is very shallow and disconnected from the underlying reality.

As a question-answering system, GPT-3 is not very good. Other approaches that are explicitly built to represent massive amounts of knowledge in "neural" associative memories are better at it.

As a dialog system, it's not very good either. Again, other approaches that are explicitly trained to interact with people are better at it.

It's entertaining, and perhaps mildly useful as a creative help. But trying to build intelligent machines by scaling up language models is like building high-altitude airplanes to go to the moon. You might beat altitude records, but going to the moon will require a completely different approach.

It's quite possible that some of the current approaches could be the basis of a good QA system for medical applications. The system could be trained on the entire medical literature and answer questions from physicians. But compiling massive amounts of operational knowledge from text is still very much a research topic.

PaulHoule5y ago

To remind people: Yann LeCun is an engineering superstar who was working on neural networks at Bell Labs long before they were cool.

Those handwritten digits that are a scourge today (e.g. 'would any of these methods work on a different set of symbols?' is unasked) came from a competition to develop a commercial zip code reader for the U.S. Postal Service back in the day of the Apple Newton. He won it!

I was a bagman for text classification data in the early 2000's and his reviews of the results you got using methods of the time (Naive Bayes, Rocchio, Perceptron, SVM) showed a depth of thought and attention to detail which helped me pick and choose tools to make classifiers with fairly predictable performance and development paths.

GPT-3 on the other hand does a good job of spouting nonsense like Peter Thiel and that has something to do with its emotional appeal. People make fun of it and laugh at the mistakes it makes like those videos where somebody kicks down one of those Boston Dynamics dogs: it's just good enough to be an object for that sort of feeling.

thomas_rm5y ago

To remind people: Yann LeCun worked on artificial neural networks (ANN) during the period where they were actively shunned by most of the scientific community. You could barely publish a paper on ANN.

Just to demonstrate, one of the most common books during that period, "Artificial Intelligence: A Modern Approach, 2nd ed" by Norvig, 1080 pages, has less than one (1!) page dedicated to ANNs. I personally think Norvig is an idiot with regards to Artificial Intelligence, and his book (used in 1500 schools in 135 countries and regions) singlehandedly slowed down the progress of AI by a few years, until a new generation of students outgrew this archaic book.

lisper5y ago

> Artificial Intelligence: A Modern Approach, 2nd ed

Published in 2002. At that point, ANN research had reached a pretty hard plateau with very few tangible results. Faulting Russell and Norvig for not going into depth about ANNs is kind of like faulting Richard Feynman for not going into depth about quantum computers in the Feynman Lectures.

Also, a lot of the subsequent work and breakthroughs on ANNs has been done at Google under Norvig's leadership as Director of Research.

ibarea235y ago

Norvig is not an idiot, and one of the humblest and nicest people I've ever met. A lot of people were wrong about ANNs.

lacker5y ago

It is not fair to call Norvig an idiot. When I was studying AI in grad school, around 2002-2003, just about everyone thought that artificial neural networks were a dead end, compared to approaches like support vector machines. Sometimes the scientific consensus is wrong, and it takes a few heroic figures plugging away to prove it. That doesn't mean that everyone in the mainstream is an "idiot".

PaulHoule5y ago

Marvin Minsky's 1969 book "Perceptrons"

https://en.wikipedia.org/wiki/Perceptrons_(book)

applied rigorous math (e.g. when computer science was new) to prove that a certain kind of single-layer neural network couldn't solve certain problems. (Can't learn XOR) It is like proving that it takes N log N comparisons to sort N items.

This dampened interest in neural networks for a long time but the "geometrical thinking in hyperdimensional space" is what the field is all about today.
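For anyone who wants to see the XOR limitation rather than take it on faith, here is a minimal sketch using the classic perceptron update rule. It is not the book's proof, only a demonstration; the function names, epoch count, and learning rate are arbitrary choices made for this example:

```python
def predict(weights, x0, x1):
    """Single-layer perceptron: a weighted sum followed by a hard threshold."""
    w0, w1, b = weights
    return 1 if w0 * x0 + w1 * x1 + b > 0 else 0

def train_perceptron(samples, epochs=100, lr=1):
    """Perceptron learning rule; stops early once an epoch has no mistakes."""
    w0 = w1 = b = 0
    for _ in range(epochs):
        errors = 0
        for (x0, x1), target in samples:
            delta = target - predict((w0, w1, b), x0, x1)
            if delta:
                errors += 1
                w0 += lr * delta * x0
                w1 += lr * delta * x1
                b += lr * delta
        if errors == 0:
            break
    return w0, w1, b

def accuracy(weights, samples):
    hits = sum(predict(weights, x0, x1) == t for (x0, x1), t in samples)
    return hits / len(samples)

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_acc = accuracy(train_perceptron(AND), AND)  # linearly separable
xor_acc = accuracy(train_perceptron(XOR), XOR)  # not linearly separable
```

AND can be split by a single line, so training converges. No single line separates the XOR points, so a single-layer unit can get at most three of the four right no matter how long it trains.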

tephra5y ago

For some context. The 2nd edition was published in 2002 (so maybe written in 2001-2002?). The fourth edition published in 2020 seems to have a bunch more things on NNs.

cbozeman5y ago

If you think Peter Thiel spouts nonsense, you must really be a 300 IQ megabrain...

integer425y ago

Or you're just a regular person who can detect silicon valley flavored self help platitudes

formercoder5y ago

High altitude planes going to the moon is a beautiful analogy. I think this is what I’ll use to explain to less technical friends why I think we’re still many years from self driving cars.

solveit5y ago

The question isn't whether high-altitude planes can go to the moon, it's whether human intelligence is closer to the clouds or to the moon. For all the talk about how language models "just" learn correlations, there's a remarkable dearth of evidence that humans do something qualitatively different.

otabdeveloper45y ago

> remarkable dearth

I've never seen a language model that could create language models. (Never mind the hardware that runs them.)

You're using a very loaded and narrow sense of 'human' here.

nicoffeine5y ago

> For all the talk about how language models "just" learn correlations, there's a remarkable dearth of evidence that humans do something qualitatively different.

GPT3 doesn't know the difference between a given set of characters and the idea/object the characters represent. It can associate "river" and "stream" and "water" but has no understanding beyond that they appear in patterns together. It couldn't possibly make the connection that river and streams are bodies of water, because there is no association with reality.

GPT3 wouldn't even know the difference between human language and characters derived from some random data source.

The only thing it does is identify deeply complex patterns, as long as there are humans around to notify it when it's doing a good job. It's going to be very useful for auto-complete, and jumping in to help users finish repetitive tasks, along with the other stuff ML is good at, but it's simply a GIGO pattern recognition system.

So I think you have it exactly backwards -- there is a dearth of evidence that AGI is even remotely possible. We have known the full anatomy of the C. elegans roundworm since 1984 -- it's 1mm long and has about 300 neurons. There is a foundation dedicated to replicating its behavior[1], and all they have achieved is complex animation.

[1] http://openworm.org/

hooande5y ago

Are you just dumbing down humans to match the model? LeCun's post is very sparse on detail, but the point is that humans can easily reason about a vast number of things that any form of sequential language model cannot. That alone is evidence that humans are doing something qualitatively different.

It isn't conclusive evidence however, and larger models may produce significantly more human like results. But from what we know about how gpt-3 works, all the evidence is on the side of it not resembling human intelligence.

gfodor5y ago

Exactly - the analogy fails if a few assumptions we have about ourselves or what GPT-3 is actually “doing” are wrong. Until we hit some asymptotic limit on training these kinds of language models, I’m withholding judgement on what such a model will be capable of representing if/when that limit arrives.

katzgrau5y ago

> there's a remarkable dearth of evidence that humans do something qualitatively different.

Perhaps now, but if history is any indication, when we (as humans) think we have a good grip on how something really works (like human intellect in this example), we've been wrong.

We model the world around us from observation and testing, find our errors, remodel, and improve over time.

Then at some point we find some piece of information that shows us our model was a decent approximation, but fundamentally wrong, and that we need to start from scratch.

If we find that we want to go beyond the moon (and we eventually will), or that the moon is further than we think, we'll again need a different approach.

I always feel like there's a certain beauty and cosmic humor to it.

dbcurtis5y ago

I have been using that analogy to explain why Tesla is many years away from a self driving car. Several others are building something that is fundamentally different.

fastball5y ago

Aren't they rolling out a beta of FSD literally as we speak?

weregiraffe5y ago

> I think this is what I’ll use to explain to less technical friends why I think we’re still many years from self driving cars.

Explain to them also that you think one highly specialized skill (driving) is the same as the sum of all human knowledge.

GrantZvolsky5y ago

> GPT-3 doesn't have any knowledge of how the world actually works. It only appears to have some level of background knowledge, to the extent that this knowledge is present in the statistics of text. But this knowledge is very shallow and disconnected from the underlying reality.

Without excessive effort, humans don't have any knowledge of how the world actually works. They only appear to have some level of background knowledge, to the extent that this knowledge is present in their faint memories. But this knowledge is very shallow and disconnected from the underlying reality.

disgruntledphd25y ago

This is just not true.

For example, all humans have the notion of object permanence, developed by about six months of life. Object permanence is the notion that things don't go away just because you can't see them.

ML systems need to be specifically trained to have object permanence, and GPT-3 almost certainly does not possess it.

Like, I get that it's hip to booster ML and GPT-3, and all of the stuff humans can do seem trivial, but it's really not the case and is something that is holding progress in AI back massively.

GrantZvolsky5y ago

Object permanence isn't even defined in the stateless world of the GPT-3 API.

My comment was merely a jab at Yann's poor argument. I don't find humans to be trivial at all, but neither do I believe that they are infinitely complex.

The linked Nabla article is fair, albeit I would appreciate more technical details. It seems to be using the API in zero-shot fashion, which is not what one would do to get the most out of it.

nickwalton005y ago

I agree that GPT-3 should not be used for medical applications, but I disagree that it's "not very good" as a dialog system. I've found it to be insanely good though it may require effective prompt engineering to work well.

tomasyany5y ago

You can find the study Yann was commenting on here: https://www.nabla.com/blog/gpt-3/

oldmonk19905y ago

It's not a "study". It's some stupid clickbait stuff to get headlines and attention.

mellosouls5y ago

As both a GPT-3 sceptic and LeCun-sceptic (nothing personal, I'm just sceptical of a lot of the modern hype and seeming cliquey-ness of the AGI "progress" centres), this summary is very useful, but sounds a bit sour-grapesy from LeCun...

corobo5y ago

Appreciated, thank you!

(feel free to off-topic downvote this troops, just wanted to make it known it was appreciated. Anonymous upvotes can only convey so much gratitude)

neural_thing5y ago· 25 in thread

I'm sure his group has done some rigorous research that I can't even understand.

But in my experience, the few-shot learner attribute of GPT-3 makes it insanely useful. We have already found several use cases for it, one of which replaces 2 ML engineers.

Yes, it's not perfect, but it's pretty good at many things, and REALLY easy to use.
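To make "few-shot" concrete: the idea is to put a handful of labeled examples in the prompt and let the model continue the pattern for a new input. A rough sketch of building such a prompt is below. The task, labels, field names, and helper function are all hypothetical, and it deliberately stops short of calling any API; whatever the model returns as the continuation would be read as the predicted label.

```python
def build_few_shot_prompt(examples, query,
                          instruction="Classify the sentiment of each review."):
    """Lay out labeled examples, then the unlabeled query, as plain text."""
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # The prompt ends right where the label should go, so the most likely
    # continuation of the text is the label itself.
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)

# Made-up examples, purely for illustration.
examples = [
    ("Arrived broken and support never answered.", "negative"),
    ("Does exactly what it says, great value.", "positive"),
]
prompt = build_few_shot_prompt(examples, "Stopped working after a week.")
print(prompt)
```

No training step is involved: changing the task means changing the examples in the text, which is a large part of why this feels easy to use.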

sillysaurusx5y ago

And when OpenAI says that your two entirely valid use cases are a safety concern, and denies you api access, what will you do? Better keep those ML engineers handy.

If you think this isn’t a concern, I’ve already seen it happen with my own eyes, rather than hearing about it second hand. They encouraged someone to make a writing tool. That someone then spent roughly six weeks prototyping, iterating, and giving constant feedback. All signals from OpenAI were “Yes, awesome!”

Then one day they simply declined to let them ship. Anything. Anything even resembling “a tool to generate huge quantities of outputs.” Which was, you know, the whole point.

You play, you pay. And I hope you’re ready to pay, because you won’t have your magical genie unless the magical genie’s caretakers believe you are sufficiently worthy.

I cloned as much of OpenAI’s API as I could: https://twitter.com/theshawwn/status/1312299759592333318?s=2...

All that’s left is to reproduce a substantially similar model. Which is doable, but will take time. In the meantime, tread carefully.

turdnagel5y ago

Just so you know, for GPT-3, Microsoft is going to be the exclusive licensee of the API: https://blogs.microsoft.com/blog/2020/09/22/microsoft-teams-...

part1of25y ago

Um, as an outside observer, what is Open about this OpenAI GPT-3 then if they’re selling exclusive rights?

Jack0005y ago

My understanding is that the exclusivity is with regard to the code, the API will still be offered to the public.

kerng5y ago

The code, not api

capableweb5y ago

Can you go into more details where it's useful? As your comment here goes directly against what's argued in the linked Facebook post.

Also, if you've found a use case where GPT-3 replaces real humans, what did those humans actually spend their time on? Seems like either you're over-hyping GPT-3, or under-hyping humanity

neural_thing5y ago

The humans spent their time building a hideously difficult classification model. Out of the box GPT-3 worked better than the result of a year of their work.

Traubenfuchs5y ago

How did they react to this as humans with human pride? Sounds painful.

landryl5y ago

That's interesting, GPT-3 can do classification too? Or did I misunderstand, and you meant your engineers used classification to build a language model that didn't perform as well as GPT-3 (which is less surprising indeed)?

blackbear_5y ago

Just because many more humans spent many more years and many more $$$ building GPT-3 for your convenience.

ApicalDendrite5y ago

Were there any concerns about GPT-3's latency? It looks like it takes a long time for online use cases.

sailingparrot5y ago

So GPT-3 didn't replace your 2 ML engineers, OpenAI did. GPT-3 didn't build itself.

srtjstjsj5y ago

Why? I can replace an excellent furniture designer with a much cheaper off the shelf desk.

fludlight5y ago

“AI” replacing the jobs of AI engineers. But we were told it was only going to do that to blue collar work!

wongarsu5y ago

Better IDEs have saved countless hours. Saving hours is equivalent to replacing jobs, unless demand is elastic enough to fill that time. Most of the time we are lucky enough that demand at a given price point is much larger than supply, but this won't last forever.

ardit335y ago

That's not how it works, as it is not zero-sum (i.e. the amount of work is not bounded or fixed; it increases).

IDEs and higher-level tools help engineers become more productive. They can do more with less. This raises the bar on products and the demand of customers for them (things are prettier, easier to use, etc.), which in turn creates more domains for software to be used in, and more demand for engineers.

Google "Induced Demand"

whymauri5y ago

Because most "AI engineering" has lost its meaning and is actually data analysis.

aaaxyz5y ago

Did it ever have a single meaning? Every company I've been through had a different definition of what "AI engineering" should be

jimbokun5y ago

We need UBI yesterday.

digital-cygnet5y ago

I would be interested in hearing more about this, within the bounds of what you can share publicly. Most of the touted GPT-3 use cases I've seen to date have dried up or are still in limbo, so hearing about a real production use would be exciting!

neural_thing5y ago

Extremely complex classification task is all I can say

lavoiems5y ago

GPT-3 is a generative model, isn't it? Can you explain how you converted GPT-3 to a classification model?
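One common way to get classification out of a generative model is few-shot prompting: show a handful of labelled examples, end the prompt with an unlabelled one, and read the label off the continuation. The following is a minimal sketch under stated assumptions, not the setup described in this thread. `complete` is a hypothetical stand-in for whatever text-completion call is available, and the prompt wording and function names are invented for illustration.

```python
from typing import Callable, Optional, Sequence, Tuple


def build_few_shot_prompt(
    examples: Sequence[Tuple[str, str]],
    labels: Sequence[str],
    text: str,
) -> str:
    """Format labelled examples plus one unlabelled input as a completion prompt."""
    lines = [f"Classify each text as one of: {', '.join(labels)}.", ""]
    for example_text, label in examples:
        lines += [f"Text: {example_text}", f"Label: {label}", ""]
    # The prompt ends right where the model is expected to write the label.
    lines += [f"Text: {text}", "Label:"]
    return "\n".join(lines)


def classify(
    text: str,
    examples: Sequence[Tuple[str, str]],
    labels: Sequence[str],
    complete: Callable[[str], str],  # hypothetical: prompt in, continuation out
) -> Optional[str]:
    """Return the label the continuation starts with, or None if none matches."""
    prompt = build_few_shot_prompt(examples, labels, text)
    continuation = complete(prompt).strip().lower()
    # Check longer labels first so a label that is a prefix of another cannot shadow it.
    for label in sorted(labels, key=len, reverse=True):
        if continuation.startswith(label.lower()):
            return label
    return None
```

Returning `None` for an unrecognized continuation keeps the caller in charge of what to do when the model wanders off the label set, which a generative model is free to do.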

ivalm5y ago

In general GPT3 is not SotA on (any?) classification task, did you just not have enough data to fine tune a discriminative transformer model? Inference should be cheaper with a smaller transformer/also less lock-in.

nickwalton005y ago

Yep few shot learning is a game changer. You don't get perfect results but you can prototype all kinds of systems extremely fast.

nmfisher5y ago

Were you a beta user, or is this now open for public access?

st1x75y ago· 25 in thread

It's nice to hear from someone who knows what they're talking about that GPT-3 is just a fancy and expensive autocomplete. The hype in some circles about it went as far as comparing it to AGI at some point which is just ridiculous.

tiborsaas5y ago

You are just a fancy and efficient autocomplete too. When you speak or write, some words have a higher probability than others. You pick alternatives, but they are limited. Of course there are more layers in the human mind, but GPT-3 is a really impressive milestone towards AGI.

It's so easy to downplay every advanced tech, it's actually fun.

Planes? Just a flying metal tube.

Self landing rockets? Just applied physics.

Smartphones? Just really good fab processes.

The internet? Just a bunch of computers.

CRISPR? Just a molecular scissor.

monktastic15y ago

> You are just a fancy and efficient autocomplete too.

We don't know enough about human cognition to say this.

Scott Aaronson has something interesting to say about this in a conversation with Lex Fridman, actually: https://youtu.be/G_-BBniFFCM?t=419

Quick copy and paste of part of the transcript:

> Humans have a lot of predictive processing a lot of just filling in the blanks but we also have these other mechanisms that we can couple to or that we can sort of call the subroutines when we need to and that maybe maybe you know to go further that one would want to integrate other forms of reasoning.

Barrin925y ago

>You are just a fancy and efficient autocomplete too. When you speak or write, some words have a higher probability than others

No you're not, and that is very easy to disprove. Look at the sentence "John took the water bottle out of the backpack so that it would be lighter". What does "it" refer to in the sentence, the bottle or the backpack?

Did it statistically come to you or did you need to consult Google? No, you know the right answer, it's the backpack. Why? Because you have a physical understanding of the world. The bottle doesn't get lighter, when you take it out of the backpack, the backpack does, because the bottle is not in there any more. This is not statistics, it's not manipulating strings, it's having a fundamental physical model of the world in your head, and an idea about how entities operate in it.

When you talk you don't do random statistical inference, you match language to the semantics you want to express, which is not statistical.

st1x75y ago

I'm not reducing GPT-3 to the extent that you're suggesting. I'm pointing out (and so does LeCun in his post) that it's a language model designed to continue a sequence of words. It has no understanding of the world and is not particularly suited for knowledge extraction or conversation.

> GPT-3 is a really impressive milestone towards AGI

We really don't know this. It's a big step for the field of language models, that's for sure. But we're so far from AGI that nobody knows which direction it's in and whether it exists at all.

moultano5y ago

> it's a language model designed to continue a sequence of words.

If a language model were able to do this task perfectly, it would be indistinguishable from intelligence, because continuing a sequence of words requires reasoning. You cannot conclude that has no understanding based solely on what it is trained to do when the task it is trained on would be sufficient to demonstrate understanding were it to fully succeed. There are lots of reasons to be skeptical of its potential, but this isn't one of them.

ralfn5y ago

>It has no understanding of the world

What does this mean? The world with countries, borders, celebrities, hypes, newsletters and ideologies?

Because that world is as virtual and not grounded in reality as the world of data points fed into this model. Culture is made up. Language is made up. It may express itself in reality but so do the algorithms of social network sites.

If something can only live in a reality of data points on the internet this thing would live mostly in the same reality as we do.

xiphias25y ago

AI / ML? Just linear algebra... I've heard that one so many times.

vladf5y ago

I happen to use slightly less fancy and expensive GPT-2 based autocomplete, and it's amazing. https://tabnine.com

tmalsburg25y ago

Interesting. As a reading researcher, I imagine that this could potentially introduce subtle and difficult to spot bugs when you get a proposed completion that looks about right i.e. close enough to what you imagined. Has this been an issue in your experience?

surething1235y ago

Not the person to whom you asked this question, but I'm also a user of TabNine. In my experience, the "recommendations" / autocompletions provided by the tool are usually very short (probably less than 20 characters on average), and I don't use it for terribly complex chunks of code. Where I like it most is in the initialization of common code chunks like `if` and `for` loops, using variables instantiated in nearby preceding lines. It figures out things like `for(customer in customers)` as I'm writing `for`.

0-_-05y ago

I use it too and it hasn't, you still need to look at the suggestions before you accept them. It's not that good, so I won't lose my job just yet.

vladf5y ago

No, it's not auto-completing function blocks, just simple expressions that are easy to validate. E.g.,

    let lo = 0;
    let hi = vec.len();
    let mid = lo + (hi

will autocomplete to `(hi - lo) / 2` as the second autocomplete option (so I'd hit tab twice). If you were to "score" it based on top-5 it'd probably be pretty bad at guessing my intent, but then again, I get to _opt-in_ to suggestions so it just needs to be right often enough, as it doesn't bother me much to keep typing.
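For context, the completed expression is the midpoint step of a binary search. Here is a minimal sketch of the surrounding loop, written in Python rather than the Rust-like syntax of the snippet. The function name and the half-open interval convention are assumptions made for illustration.

```python
from typing import Optional, Sequence


def binary_search(items: Sequence[int], target: int) -> Optional[int]:
    """Return an index of target in the sorted sequence items, or None if absent."""
    lo = 0
    hi = len(items)  # search the half-open range [lo, hi)
    while lo < hi:
        # Same shape as the autocompleted expression: lo + (hi - lo) / 2.
        mid = lo + (hi - lo) // 2
        if items[mid] < target:
            lo = mid + 1
        elif items[mid] > target:
            hi = mid
        else:
            return mid
    return None
```

In languages with fixed-width integers, `lo + (hi - lo) / 2` avoids the overflow that `(lo + hi) / 2` can hit on very large ranges. Python integers do not overflow, so here the form is only a matter of habit.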

FatalLogic5y ago

You're correct. It's only autocomplete on steroids. But I think it's remarkable that something with the very simple goal of autocomplete can, for a few sentences, sound almost alive

azinman25y ago

Because we as humans easily anthropomorphize. [1]

[1] https://en.wikipedia.org/wiki/The_Media_Equation

FatalLogic5y ago

Yes, though our tendency to anthropomorphize also helps make other people human

drcode5y ago

What evidence do I have that I'm more than a fancy autocomplete, myself? The use of squishy protestations, in lieu of objective metrics, makes LeCun's argument rather unconvincing.

btrask5y ago

If you are nothing more than fancy autocomplete, then why should we argue with you?

GPT is not trying to make a point and is not capable of changing its mind. You, hopefully, are.

Edit: I don't think you should be getting downvoted because it's a valid (and interesting) question.

dwaltrip5y ago

That is an interesting point. However, one can potentially conceive of us humans changing our minds as a form of real-time model updating.

nightski5y ago

High level reasoning and planning about what you are writing about. GPT-3 generates text one word at a time. The results are impressive for what it is. But it will not plan out what it wants to say ahead of time and construct the message to achieve that objective.

lostmsu5y ago

But of course it does within its own horizon (e.g. token window).

bmgxyz5y ago

I was about to write a reply claiming that you're different from autocomplete because you take input from more sources than just the words you've said before (e.g. your vision), but actually I can't see how that's much different from a language model. The approach seems the same, and all that's really different is the shape of the input data.

But this uncovers difficult questions about free will. If we're all just autocompleting based on a combination of the world around us, our internal state, and the physical laws, then what even is intelligence anyway? This view reduces thought to nothing more than an interesting dust storm.

Still, I find the original argument compelling, if not logically convincing. There does seem to be something missing from GPT-3 that differs fundamentally from human intelligence or AGI. But maybe that's an illusion.

Edit: I don't think you should have been downvoted, since your question is valid and constructive in my view.

minerjoe5y ago

> we're all just autocompleting

That's pretty much it. I do believe it's possible to actually develop "will", but almost nobody thinks that they need to work on such things. They confuse being a programmed robot with being a programmer.

corobo5y ago

> just a fancy and expensive autocomplete

It may be, but there's a lot in that fancy. If it were 'just' an autocomplete we'd all be using markov chains for our dumb chatbots like we were in the 2000s
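For comparison, the kind of Markov-chain text generator mentioned above fits in a few lines. This is a minimal word-level bigram sketch. The function names are invented for illustration and do not refer to any particular chatbot.

```python
import random
from collections import defaultdict
from typing import Dict, List


def train_bigram_model(text: str) -> Dict[str, List[str]]:
    """Map each word to the list of words that followed it in the training text."""
    model: Dict[str, List[str]] = defaultdict(list)
    words = text.split()
    for current, following in zip(words, words[1:]):
        model[current].append(following)
    return model


def generate(model: Dict[str, List[str]], start: str, max_words: int,
             rng: random.Random) -> str:
    """Extend start one word at a time, conditioning only on the previous word."""
    output = [start]
    for _ in range(max_words - 1):
        followers = model.get(output[-1])
        if not followers:
            break  # the last word was never followed by anything in training
        # Repeated followers appear multiple times, so choice() is frequency-weighted.
        output.append(rng.choice(followers))
    return " ".join(output)
```

The model only ever looks at the single previous word, which is the main structural difference from a large language model that conditions on a long window of preceding text.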

6gvONxR4sf7o5y ago

There’s a lot of baggage being thrown into the word fancy here. Any (and I mean any) distribution can be factored as a sequence of its random variables, with the next one being conditional on everything that’s come before, aka autocomplete.

That said, I agree more closely with LeCun than the hypers here.
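The factorization referred to above is the chain rule of probability. For any joint distribution over a sequence of random variables (the symbols below are notation chosen for illustration, not taken from the thread):

```latex
p(x_1, \dots, x_n) = \prod_{i=1}^{n} p(x_i \mid x_1, \dots, x_{i-1})
```

Each factor is a "next item given everything so far" prediction, which is why the autocomplete framing applies to any distribution over sequences and not only to language models.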

pfortuny5y ago

That is the best summary I have read in a while. Exactly that, only with 3000 words (say) of “prediction”.

moultano5y ago· 9 in thread

> Some people have completely unrealistic expectations about what large-scale language models such as GPT-3 can do.

Just want to point out that he's saying the people on the upper end of the expectation distribution are wrong, not the people in the middle of it. So if your takeaway from this is that GPT3 is nothing special, that's probably the wrong message.

computerphage5y ago

His next paragraph claims that Nabla "debunks" the idea that "large language models" can be used in healthcare.

That's not just "some people have unrealistic expectations"; it's "this tool, even when more advanced and fine-tuned, will never be appropriate to use in a very broad class of use cases".

He also says "GPT-3 has no knowledge of how the world works", which is clearly an overstatement meant to clear up hype, but is untrue. For example, GPT-3 knows more trivia than I do.

AlanYx5y ago

I'm having trouble wrapping my head around LeCun's thinking regarding the Nabla reference. The Nabla link is just a blog post by three people without any technical details provided at all. How can this possibly "debunk" anything?

Barrin925y ago

>For example, GPT-3 knows more trivia than I do.

no it doesn't, GPT-3 is a very sophisticated parrot. it doesn't know any trivia, it knows how to put the most likely string of characters next to the one it just saw, it doesn't matter what the text represents. That's the difference between you and the model.

It's basically the Chinese room. You can make an analog GPT-3 by asking a question, recording your answer, handing someone who doesn't understand a word of your language the giant box of tapes, and she tries to match them together until she appears to make sense to listeners

computerphage5y ago

AlphaGo doesn't know anything about the game of go. It "just" manipulates symbols, runs instructions on its CPUs and GPUs, illuminates pixels on the screen for a human to see.

It just also "happens" to be the case that if you interpret those pixels as go moves and play those moves against the world champion human go player then eventually that human will hold a press conference announcing to the world that AlphaGo has won the match.

computerphage5y ago

Ok, fine. It doesn't "know" any trivia. It'll still crush me in a trivia contest:

"Q: Who was president of the United States in 1801? A: Thomas Jefferson was president of the United States in 1801.

Q: What city is in the northwest corner of Ohio? A: Toledo is in the northwest corner of Ohio.

Q: What highway do you take to get to North Lake Tahoe from Oakland? A: Highway 80 is the highway you take to get to North Lake Tahoe from Oakland.

Q: What are Darius's abilities in League of Legends? A: Darius's abilities in League of Legends are: Apprehend, Decimate, Noxian Guillotine, Crippling Strike, Apprehend [sic], and Hemorrhage.

Q: Who did Capablanca defeat to become world chess champion? A: Capablanca defeated Lasker to become world chess champion."

Of those 5 questions, I could answer #1 after deploying a mnemonic and some math, and #2 if you gave me multiple choice of the top 5 cities in Ohio, and I would miss the rest.

sailingparrot5y ago

> which is clearly an overstatement meant to clear up hype, but is untrue

It all depends on your definition of knowledge. Under a certain definition you could say that GPT-3 knows basically nothing.

If someone teaches me to repeat perfectly something very smart in a language I don't know, without explaining to me what that thing is, do I have knowledge about this?

The same argument can be made about those kinds of models: the knowledge they have is about the structure of the language and which word is most likely to come next, but they have no way to ground those words in an actual relation with the world.

yorak5y ago

Aka. the Chinese room argument. However, I'm not so sure we humans are much more than pattern-matching machines. When I start to talk (or write, as I'm doing now), the words kind of just flow out. I can make the argument that I understand the "real" world, but do I really?

vannevar5y ago

Yes, I think the focus on "getting to the moon," to use his analogy, ignores the fact that GPT-3 is an SR-71 in a world of 19th century balloons. It may not get to the moon, but it definitely points the way to lots of useful stuff. There is a lot of boilerplate text in the world that is well-suited to auto-generation by a GPT-3-style model. And currently a lot of people are employed to generate that text, at significant aggregate expense.

cormacrelf5y ago

I hate this concept. I have a lot of “boilerplate” real life interactions but I would never replace them with a robot. If people really think nuts and bolts writing is not a worthwhile enough endeavour for humans to do, then shame. What an immense forfeiture, to never again be surprised by a furniture catalogue.

rfreytag5y ago· 4 in thread

No Facebook-login alternative: https://web.archive.org/web/20201027134744if_/https://www.fa...

dmurray5y ago

I didn't need to log in to read the article at the original URL, though I had to close a cookie-wall and another modal prompting me to enjoy Facebook better by signing in.

Edit: in Ireland, on Firefox desktop

capableweb5y ago

And I couldn't read the article without logging in to my Facebook account.

Facebook seems to apply different restrictions depending on where you live. I'm based in Western Europe and have never been able to read anything from Facebook without logging in. Same for Instagram.

ponker5y ago

I think Facebook makes a guess about whether you have a Facebook account or not (or, whether you are likely to log into it) and throws up the wall accordingly. If you'll just bail, they'll show it to you anyways. If they think they can force you to log in, they will.

andrewprock5y ago

I had no problem in CA using Firefox.

andybak5y ago· 4 in thread

I've got a friend who tries to talk me down every time i enthuse about GPT-3 or similar. He seems to think I believe it to me more amazing than it is whereas I struggle to convince him that I think I've got a good handle on it's limitations and I still find it mind-bogglingly amazing.

whymauri5y ago

Something I've learned over time is that sometimes it's OK to let people be curious and amazed. The world and science would be far too boring if we were cynical about _all_ of it _all_ the time.

syndacks5y ago

Not sure if this is intentional or not, but this sounds potentially bot-written because of a typo and grammatical error.

andybak5y ago

No. I just typed it on a mobile device. (but maybe that's exactly what a bot would say...)

EDIT: Actually - that's no excuse for that awful second sentence. I'm ashamed of myself.

andybak5y ago

Actually - why would a bot be more likely to make typos and grammatical errors? Surely a slightly careless human is the simpler explanation?

bigdict5y ago· 4 in thread

I think the difference between a large language model and a human intelligence is that the human may perform some extra computation to make additional connections on his own.

But other than that, aren't we all just large language models?

st1x75y ago

Not even remotely close. The difference is so big that it's almost harmful to the discussion to compare the way humans think (which we still don't have great understanding of) and the way language models work.

jcims5y ago

I completely agree but I do think humans have a language model, and considering how we use that to encode and decode the human experience might be useful in figuring out how we improve things like GPT-3.

Personally I feel that embodiment of some form, in which there is some vector space for a 'world model' that can be paired up to a language model, is a route forward. For example, if you have a Boston Dynamics (for example) robot that has a model for gravity, mass, acceleration, force, object manipulation, etc and you incorporate those into a language model, there is going to be a much richer latent space from which associations can be made between terms. If you ask GPT-3 the difference between various gaits, e.g. walk, trot, gallop, it's going to have associations with other contexts and adjectives used in the vicinity of those terms. However, if you enrich it with data from a Spot Mini that can actually execute those gaits, you're going to have information around velocity, inertia, power consumption and budget, object detection rates, route planning horizon, etc.

bigdict5y ago

Can you elaborate? How is it harmful to discuss this?..

nabla95y ago

Some of us don't use language for thinking. They just learn to use it to communicate.

Thinking the Way Animals Do: Unique insights from a person with a singular understanding. By Temple Grandin, Ph.D. https://www.grandin.com/references/thinking.animals.html

>.... A horse trainer once said to me, "Animals don't think, they just make associations." I responded to that by saying, "If making associations is not thinking, then I would have to conclude that I do not think." People with autism and animals both think by making visual associations. These associations are like snapshots of events and tend to be very specific. For example, a horse might fear bearded men when it sees one in the barn, but bearded men might be tolerated in the riding arena. In this situation the horse may only fear bearded men in the barn because he may have had a bad past experience in the barn with a bearded man.

picodguyo5y ago· 3 in thread

I agree some unrealistic expectations have been created due to people posting cherry picked output.

That said, I've spent a lot of time with it this month and think it will be an extremely useful tool for creative works of all types. It's not to a point where you can just tell it to write a blog post (yet!) but it can generate novel snippets, ideas, and variations that are actually usable. Unskilled creatives should be worried. Skilled creatives should incorporate it into their workflow.

jakear5y ago

How did you get access to it?

picodguyo5y ago

They've opened the spigot on the beta now and thousands of people have access. Just sign up and you should get in fairly quickly.

dwaltrip5y ago

How much does it cost?

emilenchev5y ago· 3 in thread

OpenAI uses well-known linguistic tricks relying on conjunctions (joining words) to split the text from which they plagiarize into clusters of 5-7 words, which is exactly the capacity of human short-term memory. They also use the Google search engine for custom queries with date restrictions, which helps them plagiarize from different texts written on a particular topic, so that when they copy, paste and concatenate clusters of words into a new text, all these 5-7 word phrases are related to one topic. This creates the illusion of meaningfulness at first glance.

GPT-3 on Progress. “Civilization rose on the exponential curve. We shouldn’t expect progress to follow a straight line.”

Google with date restriction before 1 April 2020: "progress to follow a straight line".

Do you see that only one result comes up? :-)

https://chrismukiibi.com/2019/11/26/the-valley-of-disappoint...

"We shouldn’t expect progress to follow a straight line." and "we expect our progress to follow a straight line"

Do you understand now how they use conjunctions (joining words) to insert or delete insignificant words such as "shouldn't" and "our" so that their plagiarism is not caught?

emilenchev5y ago

Now run a Google search again with a date restriction before 1 April 2020:

"After two days of intense debate" "the United Methodist Church has agreed to" "one that is expected to end" "in the creation of a new denomination"

You will find the sources of this GPT-3 text:

After two days of intense debate, the United Methodist Church has agreed to a historic split – one that is expected to end in the creation of a new denomination, one that will be “theologically and socially conservative,” according to The Washington Post. The majority of delegates attending the church’s annual General Conference in May voted to strengthen a ban on the ordination of LGBTQ clergy and to write new rules that will “discipline” clergy who officiate at same-sex weddings. But those who opposed these measures have a new plan: They say they will form a separate denomination by 2020, calling their church the Christian Methodist denomination. The Post notes that the denomination, which claims 12.5 million members, was in the early 20th century the “largest Protestant denomination in the U.S.,” but that it has been shrinking in recent decades. The new split will be the second in the church’s history. The first occurred in 1968, when roughly 10 percent of the denomination left to form the Evangelical United Brethren Church. The Post notes that the proposed split “comes at a critical time for the church, which has been losing members for years,” which has been “pushed toward the brink of a schism over the role of LGBTQ people in the church.” Gay marriage is not the only issue that has divided the church. In 2016, the denomination was split over ordination of transgender clergy, with the North Pacific regional conference voting to ban them from serving as clergy, and the South Pacific regional conference voting to allow them.

emilenchev5y ago

"GPT-3 is a language model, which means that you feed it a text and ask it to predict the continuation of the text, one word at a time."

So Yann LeCun is not right about GPT-3 here. They don't try to predict anything, especially not "one word at a time". They simply plagiarize big clusters of 5-7 words or whole sentences and make cosmetic modifications to them so that they are not recognized by a direct Google search. When you plagiarize such big clusters written by humans, it is normal for the resulting GPT-3 text to look human-written to readers. And they don't use statistics for anything except the plagiarism process itself. It would be a strange ML algorithm that used phrases with low statistical value such as "in the creation of a new denomination" or "progress to follow a straight line". ;-) When you can't emulate human brain processes you can always simulate human plagiarism. Especially when there are so many idiots who believe in closed "open source" code and don't ask elementary questions such as "Why would someone's project suddenly start to show such "astonishing" results exactly when they close the code behind it?"

emilenchev5y ago

You can ask Facebook how they stick their noses in their users' computers, and who is the real person behind this work: https://www.sciencemag.org/news/2017/11/artificial-intellige...

Or how they still owe me a bug bounty reward from 2017 for a bug in their login form that permitted reaping their whole database of users' phone numbers, which I, like the last fool, gave to them through official channels without exploiting it. As "thanks" they hacked my PC and stole my intellectual property, which you see above.

The next year, 2018, 2 weeks before the big hack, I warned them that I had found a "new" bug, but that this time they must first pay me the bug bounty reward from 2017, and then I would give them the new one. THEY REFUSED to pay me AGAIN. The rest is history...

I waited for the FBI to arrest me. They don't want to, even though I never hid my identity, because they would have to confess that they hacked my PC, that they owe me money, AND THAT THEY DON'T CARE ABOUT THE SECURITY OF THEIR USERS. I very much like how I faced them.

dtarasov35y ago· 2 in thread

Really? I have been doing research on language models in medical diagnostics even before GPT-2, and found that when trained and applied in certain ways, language models (even much smaller than GPT-3!) are very good at diagnosis predictions, they can compete with much more complex symptom checkers at that.

Proof: Link to my paper (written back in 2019) and a bit less technical article. http://www.dialog-21.ru/media/4632/tarasovdplusetal-069.pdf https://www.linkedin.com/pulse/language-models-multi-purpose...

I applied for GPT-3 access the day after the application form became available, described my research and experience in detail, but there was no reply.

Now, they gave access to these people at nabla, and they just asked a bunch of stupid questions using top-k random sampling to generate answers and claimed that this debunks something. This study debunks nothing and proves nothing, it is stupid and only done to get some hype from GPT-3 popularity.

Ok, I am sorry for being rude, but I am really upset because I spent years working on this problem using whatever computational resources I could get and obtained some interesting results, and based on these I think that GPT-3 should be capable of doing amazing things for diagnostics when used properly. Why won't OpenAI give access to a researcher who wants to do some serious but a bit mundane work, but gives it to people who use it to create hype?
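For readers unfamiliar with the term, top-k random sampling means drawing the next token at random from only the k highest-scoring candidates, usually after rescaling the scores by a temperature. This is a minimal sketch of one sampling step, assuming the model's scores are given as a plain dictionary of logits. The function name and the toy values in the usage note are invented for illustration, and nothing here reflects how the Nabla test was actually configured.

```python
import math
import random
from typing import Dict


def sample_top_k(logits: Dict[str, float], k: int, temperature: float,
                 rng: random.Random) -> str:
    """Sample one token from the k highest-scoring candidates."""
    if not logits or k < 1 or temperature <= 0:
        raise ValueError("need non-empty logits, k >= 1 and temperature > 0")
    # Keep only the k candidates with the highest scores.
    ranked = sorted(logits.items(), key=lambda item: item[1], reverse=True)[:k]
    best = ranked[0][1]
    tokens = [token for token, _ in ranked]
    # Softmax weights with temperature; subtracting the max keeps exp() stable.
    weights = [math.exp((score - best) / temperature) for _, score in ranked]
    return rng.choices(tokens, weights=weights, k=1)[0]
```

With toy logits such as `{"flu": 2.0, "cold": 1.5, "zebra": -3.0}`, setting `k=1` always returns the top candidate, while a larger `k` or a higher temperature lets repeated runs give different answers to the same question. That run-to-run variation is one reason a handful of sampled answers says little about the best a model can do.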

2-tpg5y ago

I used GPT-2 to create a health website. One sentence was enough to get a full page of authoritative-sounding lists of symptoms and treatments. Very diverse, unlike all other sites, because the articles it generated only looked and sounded like a health encyclopedia. Of course it is going to spit back a decent diagnosis when it is in the training data, but what do you trust? An expert system that logically and interpretably explains its predictions, linking the original source? Or a language model that uses a temperature to stay on track, and randomizes its output on every new run?

Generating data with a possible high impact on lives sounds like a recipe for disaster and, frankly, irresponsible. And Google would have to really solve it, to detect false or questionable information, when it's not possible to rely on spam signals (like when a legit site is transferred to a malicious spammer).

Aside, I bet LeCun would be more favorable toward GPT-3 had it been a deep CNN and they had adopted his self-supervised learning paradigm :).

nombinoms5y ago

It is self-supervised learning. Specifically, a masking auto encoder.

MrXOR5y ago· 2 in thread

Can someone please paste his post here? (I don't have an fb account and don't want to sign up)

spott5y ago

You don't need one to see it.

MrXOR5y ago

Thanks.

Now I see @rfreytag's comment: https://news.ycombinator.com/item?id=24907760

EDIT: Yann's fb post:

Some people have completely unrealistic expectations about what large-scale language models such as GPT-3 can do.

This simple explanatory study by my friends at Nabla debunks some of those expectations for people who think massive language models can be used in healthcare.

As a question-answering system, GPT-3 is not very good. Other approaches that are explicitly built to represent massive amount of knowledge in "neural" associative memories are better at it.

As a dialog system, it's not very good either. Again, other approaches that are explicitly trained to perform to interact with people are better at it.

tosh5y ago· 2 in thread

A swiss army knife isn't as good at cutting cheese as a cheese knife.

Sharlin5y ago

It's the other way around. GPT-3 is a specialized tool that people are hyping up as a general reasoning agent, or at least a major step towards one.

typon5y ago

Disingenuous comparison, since in this case people were acting like the Swiss army knife will overthrow the human race and usher in the singularity.

FartyMcFarter5y ago· 2 in thread

It would seem that this can be easily analysed scientifically.

To give a simple example: if, hypothetically, someone thought that GPT-3 is good at basic arithmetic (1 plus 1, 1000 times 3 etc.), they can provide a template for how to ask GPT-3 questions about arithmetic. Anyone can then verify that this template results in accurate answers, by asking randomly sampled questions using that template.

This verification method could be applied to pretty much any problem. Has anyone done anything like that?
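A minimal sketch of the check described above, assuming a caller-supplied `ask_model` function that sends a prompt to whatever model is being tested and returns its text reply. The template wording, the function names, and the "first integer in the reply" scoring rule are all assumptions made for illustration; none of them come from the GPT-3 paper or API.

```python
import random
import re
from typing import Callable

# Hypothetical prompt template; the wording is an assumption.
TEMPLATE = "Q: What is {a} {op_word} {b}?\nA:"

OPS = {
    "plus": lambda a, b: a + b,
    "minus": lambda a, b: a - b,
    "times": lambda a, b: a * b,
}


def sample_question(rng: random.Random, digits: int) -> tuple[str, int]:
    """Return a (prompt, expected answer) pair with operands of `digits` digits."""
    lo, hi = 10 ** (digits - 1), 10 ** digits - 1
    a, b = rng.randint(lo, hi), rng.randint(lo, hi)
    op_word = rng.choice(sorted(OPS))
    return TEMPLATE.format(a=a, op_word=op_word, b=b), OPS[op_word](a, b)


def accuracy(ask_model: Callable[[str], str], digits: int,
             n: int = 200, seed: int = 0) -> float:
    """Fraction of n randomly sampled questions the model answers correctly."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n):
        prompt, expected = sample_question(rng, digits)
        reply = ask_model(prompt)
        # Score the first integer in the reply, ignoring thousands separators.
        match = re.search(r"-?\d+", reply.replace(",", ""))
        if match and int(match.group()) == expected:
            correct += 1
    return correct / n
```

Running `accuracy` for `digits` of 1, 2, 3 and so on with a fixed seed would give a reproducible accuracy-versus-operand-size curve that anyone could re-run with the same template.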

computerphage5y ago

The original paper is full of comparisons and benchmarks. It includes a section on arithmetic.

FartyMcFarter5y ago

Thanks! The corresponding graphs in the paper show that it's OK at two-digit operations (except multiplication), but it doesn't generalize to bigger numbers. This would seem to support LeCun's statement that there's a lot of over-hyping going on.

vessenes5y ago· 2 in thread

I have a really hard time believing that LeCun thinks this is true. GPT3 is outstanding at conversation. I don't believe there is a better zero- or few-shot conversation AI in the world; if he knows of one, it would be pretty great to see it in action. Here is a conversation I had with my GPT-3 chatbot a few months ago. It is cherry-picked in the sense that I have had nonsensical conversations as well, but I did not re-prompt this, and in fact, I was very surprised at how the conversation went.

Note that GPT3, without any additional prompting, knows who Bill O'Reilly is, has an idea of what he thinks about Bitcoin, knows that I'm associated with Bitcoin, knows that Bill O'Reilly would insult someone by calling them a liberal, and knows that Noam Chomsky dislikes Bill O'Reilly.

This is not a shitty, poor conversation with only a "shallow understanding" of the world. Seriously. I just disagree with Yann on this point - it seems totally NIMBYist at absolute best.

Anyway, read the unedited raw chatlog below:

Peter Vessenes, [Jul 23, 2020 at 11:04:11 PM]: /summon Bill O'Reilly

Peppertest, [Jul 23, 2020 at 11:04:12 PM]: You have summoned Bill O'Reilly: to stop talking, type /qa. To reset type /reset

Peter Vessenes, [Jul 23, 2020 at 11:04:16 PM]: Hi Bill, how are you?

Bill O'Reilly, [Jul 23, 2020 at 11:04:17 PM]: I'm good, I'm good.

Peter Vessenes, [Jul 23, 2020 at 11:04:21 PM]: What are you doing lately?

Bill O'Reilly, [Jul 23, 2020 at 11:04:23 PM]: I'm busy. I'm working on a book.

Peter Vessenes, [Jul 23, 2020 at 11:04:27 PM]: What's the title?

Bill O'Reilly, [Jul 23, 2020 at 11:04:29 PM]: The title is "Killing Jesus".

Peter Vessenes, [Jul 23, 2020 at 11:04:35 PM]: ..? Really? What's it about?

Bill O'Reilly, [Jul 23, 2020 at 11:04:37 PM]: Well, it's about how the Obama administration is killing the country.

Peter Vessenes, [Jul 23, 2020 at 11:04:41 PM]: ahhh got it

Bill O'Reilly, [Jul 23, 2020 at 11:04:43 PM]: Yeah, and you know, I'm a huge fan of Bitcoin.

Peter Vessenes, [Jul 23, 2020 at 11:04:47 PM]: Thanks!

Bill O'Reilly, [Jul 23, 2020 at 11:04:48 PM]: I think it's the future of money.

Peter Vessenes, [Jul 23, 2020 at 11:04:51 PM]: I agree.

Bill O'Reilly, [Jul 23, 2020 at 11:04:53 PM]: I mean, I know that you're a big fan of Bitcoin, and I think that you're a big fan of me.

Peter Vessenes, [Jul 23, 2020 at 11:05:00 PM]: Actually I hate you deeply

Bill O'Reilly, [Jul 23, 2020 at 11:05:02 PM]: No, no, no, you love me.

Peter Vessenes, [Jul 23, 2020 at 11:15:47 PM]: No, I really dislike arch-conservatives

Bill O'Reilly, [Jul 23, 2020 at 11:15:50 PM]: You know what, Peter? You know what? You're a liberal.

Peter Vessenes, [Jul 23, 2020 at 11:19:15 PM]: /summon Noam Chomsky

Peppertest, [Jul 23, 2020 at 11:19:16 PM]: You have summoned Noam Chomsky: to stop talking, type /qa. To reset type /reset

Peter Vessenes, [Jul 23, 2020 at 11:19:25 PM]: Hi Noam.

Noam Chomsky, [Jul 23, 2020 at 11:19:27 PM]: Hello, Peter.

Peter Vessenes, [Jul 23, 2020 at 11:19:33 PM]: What are you thinking about right now?

Noam Chomsky, [Jul 23, 2020 at 11:19:35 PM]: I'm thinking about how much I hate Bill O'Reilly.

Peter Vessenes, [Jul 23, 2020 at 11:27:52 PM]: I can understand that feeling

Noam Chomsky, [Jul 23, 2020 at 11:27:55 PM]: He's a horrible, horrible man.

CJefferson5y ago

It depends on how you define shallow. I find GPT3 is indeed amazing at general knowledge, and at making plausible sentences and conversations out of that general knowledge. However, if I start asking it questions which probably haven't been asked before, or ask it to connect ideas, it often falls apart.

disgruntledphd25y ago

GPT is great at text generation, but astonishingly poor at coherent text generation over a window of more than a few sentences.

To be fair, this is a massive improvement from SOTA ten years ago, but it's definitely nowhere near AGI (and maybe not even a finger pointing at the moon).

this_was_posted5y ago· 1 in thread

To me GPT-3 feels more like a rocket-booster than a high-altitude plane. On its own it's not going to reach the moon, but combined with the right guidance and additional thrust it just might.

I think being able to model the future outcome of something in a way similar to how humans would (like GPT-3 does) is the first input step for an overarching AI to reach some kind of sentience.

With my admittedly limited understanding, I believe that what differentiates our thinking most from that of other animals is that we are able to evaluate, order and steer our thoughts much better. If we can develop something that can steer these GPT-3 "thoughts", I imagine we could get quite close to sentience.

sooheon5y ago

Stack more GPT-3s! Have GPT-ception via stacks of multi-headed GPT blocks. I'm sure softmax attention can be modeled as a few-shot text generation problem.

msamwald5y ago· 1 in thread

The original Nabla article is missing information on how they primed GPT-3 for each use-case, and how much effort they put into finding good ways of priming.

All fancy GPT-3 demos seem to rely on good priming.

The time scheduling problems are probably a hard limit of GPT-3's capabilities. The "kill yourself" advice, on the other hand, might have been avoided by better priming.

FartyMcFarter5y ago

Wouldn't this kind of priming be brittle and unreliable? Has anyone successfully primed GPT-3 to solve complex problems consistently?

73737373735y ago· 1 in thread

I wonder what he thinks about using it in automated theorem proving: https://twitter.com/spolu/status/1303578985276887042

chillee5y ago

That's not GPT-3.

lacker5y ago· 1 in thread

IMO, the real innovation in GPT-3 is that the API plus "playground" setup is far easier to use than sharing a big chunk of Python code and data files in a Jupyter notebook. If the researchers behind other cutting-edge language models made their work just as easy to use, I believe we would see similar excitement there as well.

atty5y ago

I wouldn’t call that an innovation - I’d call that the antithesis of OpenAI’s supposed mission. Sure, offer an API. But release the code and weights too, so I can inspect them.

InfiniteRand5y ago· 1 in thread

One thing I've been wondering: could you train a GPT-3 model to generate "better" text prompts for another GPT-3 model?

By better I mean grading based on whether there is any nonsense in the output or any internal contradictions, or similar criteria.

dunefox5y ago

> By better I mean grading based on whether there is any nonsense in the output or any internal contradictions, or similar criteria

Sounds like you want a hard AI to determine whether a language model generates nonsense.

amelius5y ago· 1 in thread

This doesn't sound like a very rigorous refutation. Is this the way debunking works in deep learning circles?

Anyway, I can refute the refutation using the same standard: lots of things about the real world can be learned from just reading text, and no reason is given why a DL model couldn't learn them too.

computerphage5y ago

It's not a very rigorous refutation. I think that's part of the reason this post is so contentious.

alper1115y ago· 1 in thread

It is quite interesting that LeCun is very critical when it comes to GPT from OpenAI. The same arguments can also be made about the current deep learning paradigm and convolutional nets, but you don't see any criticism from him when it comes to that stuff. Look at his arguments when he is tweet-debating with Gary Marcus.

alper1115y ago

I also do not understand the downvote for a fair criticism.

cblconfederate5y ago· 1 in thread

GPT3 is definitely overrated at this time. Considering how it was built, it should not be considered more intelligent than central pattern generators [https://en.wikipedia.org/wiki/Central_pattern_generator]. It's just a pattern generator that generates language instead of a walking pattern. Ascribing intelligence to it has led to some comical claims and studies. Let's start building something smart on top of this generator.

aaron-santos5y ago

Not that I intend to do the research, but I'd love to see a combination of deep frame semantic extraction laid on top of GPT-n. The formal logic constructions associated with frame semantics have a shot at pushing text models at least away from logical and ontological contradictions.

forgotmypw175y ago

Text reproduced, minus abusive shell of dark patterns:

Some people have completely unrealistic expectations about what large-scale language models such as GPT-3 can do.

This simple explanatory study by my friends at Nabla debunks some of those expectations for people who think massive language models can be used in healthcare.

As a question-answering system, GPT-3 is not very good. Other approaches that are explicitly built to represent massive amount of knowledge in "neural" associative memories are better at it.

As a dialog system, it's not very good either. Again, other approaches that are explicitly trained to perform to interact with people are better at it. It's entertaining, and perhaps mildly useful as a creative help. But trying to build intelligent machines by scaling up language models is like building high-altitude airplanes to go to the moon. You might beat altitude records, but going to the moon will require a completely different approach.

But compiling massive amounts of operational knowledge from text is still very much a research topic.

1 more reply

confuseshrink5y ago

Yann is a consistently sober voice in this world of AI hype. I find it quite refreshing.

Personally I see little evidence that this "just scale a transformer until sentience" hype-train is going to take us anywhere interesting or particularly useful.

And for the people who claim it is super useful already, can you actually trust its outputs without any manual inspection in a production setting? If not it's probably not as useful as you think it might be.

avivo5y ago

I think I disagree about the lack of quality as part of a "dialog system". The examples given in that piece don't show any priming (e.g. the one calling for suicide...). I would definitely not use such a model in a fully automated manner, but the outputs are better than that of most humans in many cases, and could support a human-in-the-loop system.

Here is an example I ran that was not cherry picked (I ran only 2 prompts like this, both were equally good).

Think about how many people you know who would respond as well? (I don't know about you, but I definitely feel like I'm at a bit of a cliffhanger at the end also!)

----- Prompt ------

Mike is a mediator and crisis counselor.

===

User: My dad is always getting sucked into conspiracy theories and it's getting worse. I'm worried since he recently bought a gun. I'm not sure what's going on in his head. What I should do? What should I say?

Mike:

----- GPT-3 -----

I can't diagnose your dad but it sounds like he's dealing with some difficult issues or there's some other stressor in his life. When people feel under stress or are dealing with difficult issues it's common for them to look for meaning and solutions outside themselves. It's helpful to approach him in a non-judgmental, non-confrontational way to start a dialogue. It sounds like you're already trying to do that. I'd suggest that you let him know that you're concerned and you want to be helpful. Then ask him what he's going through and how you might help.

Listen to him without judging, interrupting or trying to solve things. Let him know you're there for him and that you care. You could also let him know that you're concerned about his gun and the consequences of using it.

If he's willing to talk, when he feels like it, he might feel relieved to have someone to talk to.

If he's not willing

hawkice5y ago

Yann LeCun knows his stuff, but he doesn't provide an answer for "What is the upper bound of results you can get from just making a bigger neural net?" The most interesting thing about GPT-3 is that they didn't appear to find that limit. They could keep going. Even if the limit exists in principle, if it's 7 orders of magnitude away, we should seriously consider whether or not the system will be smarter than a human before it reaches that point.

It could be a factor of 2 away from GPT-3! It could be something they already reached, if it is close! But we don't know. And without these answers, this is going to end up being one of the most interesting technical projects in the world.

kelvin05y ago

Wow, you mean ELIZA was not a real shrink? https://en.wikipedia.org/wiki/ELIZA

Always surprising what people expect from ML!

nmaley5y ago

The relationship between language and the world is this: utterances both signify and depict objects and events in the real world. So, if I say "I saw Alec Baldwin at the basketball game last night", then that depicts an event in the real world. And, if and only if the statement is true, an event similar to that depicted was part of the causal history of the utterance itself. The causal history of the utterance determines the significance of the utterance, just as the causal history of a footprint determines its significance. To understand a sentence is to understand what it depicts in the real world, and what it actually signifies in the real world. The ability to tell true from false is the ability to detect a disconnect between what is depicted and what is actually signified. That is what LeCun implicitly means by language understanding.

So, what does a sentence produced by GPT3 signify, and what does it depict? What it signifies (i.e. its causal history) is that this sequence of words is what human writers would most likely use when producing an utterance containing whatever trigger words the model has been fed. In other words, it's a statistical model. What it depicts is whatever the mapping rules for that language tell us it depicts. Since human beings usually tell the truth, a statistical model will usually produce true statements. It will also seem to have the ability to tell true from false, in many cases. But because GPT3 has no model for the significance of its sentences, it cannot be said to have any understanding of language, in the sense humans have it. LeCun's point about flying an aeroplane to the moon is essentially correct.

6gvONxR4sf7o5y ago

What I think he misses is that with a massive corpus and top-tier specialist researchers, sure, you can definitely do better, but the point of a plain-text-programmed few-shot learner as a product is that it’s better than what your average startup’s ML team can confidently produce, if for no other reason than the training money dumped into it.

Jury’s out on whether the things it’s better at matter much in the marketplace. If I want to know George Washington’s birthday I’ll ask Google.

maxhodges5y ago

Those of us with professional knowledge of AI understand this already. Who are you arguing against? Still, the scale of GPT-3's model makes it novel, useful, and interesting.

>GPT-3 doesn't have any knowledge of how the world actually works.

I agree, it lacks what we'd consider robust, semantic models of common sense knowledge. However, my dog doesn't understand how the world actually works either, yet he can do many things most people would consider intelligent. Besides, most people are able to operate mobile phones, play computer games, operate automobiles, and turn on the lights without knowing how these things "actually" work. So is that even a prerequisite for intelligence?

>Some people

Who specifically? I have to admit, the views you are criticizing are preposterous, but then what I want to know is why you're wasting your time and ours criticizing such junk. Some people think the earth is flat, but it's not worth arguing with them.

cgarciae5y ago

I think we have to make a distinction here:

- On one hand, having access to these large-scale language models that can do few-shot learning is incredibly useful for the industry, as they can be easily deployed to solve thousands of simple tasks.

- On the other hand, this approach will not solve harder problems (as Yann points out), and "just" creating bigger models using the same techniques is probably not the path forward in those domains.

frob5y ago

Just a side note, the company he references, Nabla, was founded by a chunk of the people who created the NLP development platform wit.ai (YC W14[?]) which was acquired by FB in January 2015.

sjg0075y ago

Makes sense. You need a richer world model associated with the text than is embedded in word choice. You also need analogies and metaphorical reasoning.

rprenger5y ago

I'd like to hear GPT-3's rebuttal..

iamjohnsears5y ago

Yann is starting to sound like his good buddy Gary Marcus


leftyted5y ago· 27 in thread

Reading this is really interesting:

> GPT-3 doesn't have any knowledge of how the world actually works.

babesh5y ago

Tell that to all the other animals on Earth. Do they not also have knowledge? Do you really think they encode their knowledge in language?

Do you really think that humans are so special as to encode all their knowledge in language? Watch a movie. Listen to a song. Examine a piece of art. Feel sculpture. Play a guitar. Dance.

There is a segment of the software community that is highly language centric/adept. But that community is often blind to other forms of understanding.

dragonwriter5y ago

> Tell that to all the other animals on Earth. Do they not also have knowledge? Do you really think they encode their knowledge in language?

The real mechanistic view drops language as a special case and just says there is no knowledge, only behavior.

forgotmypw175y ago

>Tell that to all the other animals on Earth. Do they not also have knowledge? Do you really think they encode their knowledge in language?

Well, actually, yes, they do. Many animals have elaborate languages encompassing many concepts. Crows can explain to one another what a human looks like, for example.

skybrian5y ago

leftyted5y ago

I agree with you in that I would make a strong distinction between what a model like GPT-3 does and whatever it is that humans do.

skybrian5y ago

Well, it's certainly interesting that it can learn metaphors, and this can be useful for creative purposes, so it's fun to play with.

If you consider the use of a metaphor to be showing judgement, it means that a particular metaphor seems to be appropriate to use in a particular context.

luc4sdreyer5y ago

[1] https://www.theverge.com/2017/10/26/16552056/a-intelligence-...

[2] https://www.lesswrong.com/posts/WxW6Gc6f2z3mzmqKs/debate-on-...

pas5y ago

He's not wrong, we're very far. And looking at past "progress" it seems that we'll get there very slowly. So it seems long-term.

alexanderskates5y ago

I think an important distinction to make is your use of the word "language", and how we think of language as it concerns human minds, and as it concerns GPT-3.

leftyted5y ago

Great point.

Are you sure? Aren't "concepts" encoded in how language is used, at least to some degree?

LeCun does say that models that explicitly attempt to represent knowledge perform better than GPT-3 in terms of answering questions. I'm no expert but I believe him.

jsinai5y ago

>Aren’t “concepts” encoded in how language is used, at least to some degree?

Good point and I think this shows up to the extent different languages might affect how we express particular concepts.

For anyone interested in art and art history, this distinction was the central focus of the French surrealist painter Rene Magritte.

powerapple5y ago

erispoe5y ago

Animals that do not have a language they can describe the world in still have knowledge about the world.

leftyted5y ago

blancNoir5y ago

powerapple5y ago

mannykannot5y ago

1 more reply

c06n5y ago

sgt3pr5y ago

I find this concept of knowledge encoded by language very interesting. Is there any author you can point to that follows this idea?

disambiguation5y ago

fallous5y ago

disambiguation5y ago

a few things:

1. i get where you're coming from

2. yes, language is bottlenecked by human perception, as are all things

3. even the notion of myth and fiction is encoded in language. language is self-descriptive and self-aware and you can separate sense from non-sense.

1 more reply

runwai5y ago

> metaphors only refer to other metaphors, i.e. language is circular.

Except we (humans) have real-life experience that gives meaning to those metaphors.

maxlamb5y ago

jart5y ago

smusamashah5y ago

> I think this is a philosophical question.

If it's only philosophical, then my saying that the Hacker News website itself has 'knowledge' of everything we discuss is also philosophical. The same can be applied to plain paper books.

How about any web application? A for loop? anything which can generate something for you?

nurettin5y ago

Can language as knowledge be falsified? Or is every statement equally true?

czzr5y ago· 27 in thread

For anyone else who doesn’t want to deal with Facebook, here’s the post:

Some people have completely unrealistic expectations about what large-scale language models such as GPT-3 can do.

This simple explanatory study by my friends at Nabla debunks some of those expectations for people who think massive language models can be used in healthcare.

As a question-answering system, GPT-3 is not very good. Other approaches that are explicitly built to represent massive amount of knowledge in "neural" associative memories are better at it.

As a dialog system, it's not very good either. Again, other approaches that are explicitly trained to perform to interact with people are better at it.

PaulHoule5y ago

To remind people: Yann LeCun is an engineering superstar who was working on neural networks at Bell Labs long before they were cool.

thomas_rm5y ago

lisper5y ago

> Artificial Intelligence: A Modern Approach, 2nd ed

Also, a lot of the subsequent work and breakthroughs on ANNs has been done at Google under Norvig's leadership as Director of Research.

1 more reply

ibarea235y ago

Norvig is not an idiot, and he is one of the humblest and nicest people I've ever met. A lot of people were wrong about ANNs.

lacker5y ago

PaulHoule5y ago

Marvin Minsky's 1969 book "Perceptrons"

https://en.wikipedia.org/wiki/Perceptrons_(book)

This dampened interest in neural networks for a long time but the "geometrical thinking in hyperdimensional space" is what the field is all about today.

1 more reply

tephra5y ago

For some context. The 2nd edition was published in 2002 (so maybe written in 2001-2002?). The fourth edition published in 2020 seems to have a bunch more things on NNs.

cbozeman5y ago

If you think Peter Thiel spouts nonsense, you must really be a 300 IQ megabrain...

integer425y ago

Or you're just a regular person who can detect silicon valley flavored self help platitudes

1 more reply

formercoder5y ago

High altitude planes going to the moon is a beautiful analogy. I think this is what I’ll use to explain to less technical friends why I think we’re still many years from self driving cars.

solveit5y ago

otabdeveloper45y ago

> remarkable dearth

I've never seen a language model that could create language models. (Never mind the hardware that runs them.)

You're using a very loaded and narrow sense of 'human' here.

nicoffeine5y ago

> For all the talk about how language models "just" learn correlations, there's a remarkable dearth of evidence that humans do something qualitatively different.

GPT3 wouldn't even know the difference between human language and characters derived from some random data source.

[1] http://openworm.org/

1 more reply

hooande5y ago

gfodor5y ago

katzgrau5y ago

> there's a remarkable dearth of evidence that humans do something qualitatively different.

Perhaps now, but if history is any indication, when we (as humans) think we have a good grip on how something really works (like human intellect in this example), we've been wrong.

We model the world around us from observation and testing, find our errors, remodel, and improve over time.

Then at some point we find some piece of information that shows us our model was a decent approximation, but fundamentally wrong, and that we need to start from scratch.

If we find that we want to go beyond the moon (and we eventually will), or that the moon is further than we think, we'll again need a different approach.

I always feel like there's a certain beauty and cosmic humor to it.

1 more reply

dbcurtis5y ago

I have been using that analogy to explain why Tesla is many years away from a self driving car. Several others are building something that is fundamentally different.

fastball5y ago

Aren't they rolling out a beta of FSD literally as we speak?

4 more replies

weregiraffe5y ago

> I think this is what I’ll use to explain to less technical friends why I think we’re still many years from self driving cars.

Explain to them also that you think one highly specialized skill (driving) is the same as the sum of all human knowledge.

GrantZvolsky5y ago

disgruntledphd25y ago

This is just not true.

For example, all humans have the notion of object permanence, developed by about six months of life. Object permanence is the notion that things don't go away just because you can't see them.

ML systems need to be specifically trained to have object permanence, and GPT-3 almost certainly does not possess it.

Like, I get that it's hip to boost ML and GPT-3, and all of the stuff humans can do seems trivial, but that's really not the case, and this attitude is holding progress in AI back massively.

GrantZvolsky5y ago

Object permanence isn't even defined in the stateless world of the GPT-3 API.

My comment was merely a jab at Yann's poor argument. I don't find humans to be trivial at all, but neither do I believe that they are infinitely complex.

The linked Nabla article is fair, although I would appreciate more technical details. It seems to be using the API in zero-shot fashion, which is not what one would do to get the most out of it.

1 more reply

nickwalton005y ago

tomasyany5y ago

You can find the study Yann was commenting on here: https://www.nabla.com/blog/gpt-3/

oldmonk19905y ago

It's not a "study". It's some stupid clickbait stuff to get headlines and attention.

mellosouls5y ago

corobo5y ago

Appreciated, thank you!

(Feel free to downvote this as off-topic, troops; I just wanted to make it known it was appreciated. Anonymous upvotes can only convey so much gratitude.)

neural_thing5y ago· 25 in thread

I'm sure his group has done some rigorous research that I can't even understand.

But in my experience, the few-shot learner attribute of GPT-3 makes it insanely useful. We have already found several use cases for it, one of which replaces 2 ML engineers.

Yes, it's not perfect, but it's pretty good at many things, and REALLY easy to use.

sillysaurusx5y ago

And when OpenAI says that your two entirely valid use cases are a safety concern, and denies you API access, what will you do? Better keep those ML engineers handy.

Then one day they simply declined to let them ship. Anything. Anything even resembling “a tool to generate huge quantities of outputs.” Which was, you know, the whole point.

You play, you pay. And I hope you’re ready to pay, because you won’t have your magical genie unless the magical genie’s caretakers believe you are sufficiently worthy.

I cloned as much of OpenAI’s API as I could: https://twitter.com/theshawwn/status/1312299759592333318?s=2...

All that’s left is to reproduce a substantially similar model. Which is doable, but will take time. In the meantime, tread carefully.

turdnagel5y ago

Just so you know, for GPT-3, Microsoft is going to be the exclusive licensee of the API: https://blogs.microsoft.com/blog/2020/09/22/microsoft-teams-...

part1of25y ago

Um, as an outside observer, what is Open about this OpenAI GPT-3 then if they’re selling exclusive rights?

1 more reply

Jack0005y ago

My understanding is that the exclusivity is with regard to the code; the API will still be offered to the public.

1 more reply

kerng5y ago

The code, not the API.

capableweb5y ago

Can you go into more detail about where it's useful? Your comment here goes directly against what's argued in the linked Facebook post.

Also, if you've found a use case where GPT-3 replaces real humans, what did those humans actually spend their time on? Seems like either you're over-hyping GPT-3, or under-hyping humanity.

neural_thing5y ago

The humans spent their time building a hideously difficult classification model. Out of the box GPT-3 worked better than the result of a year of their work.

Traubenfuchs5y ago

How did they react to this as humans with human pride? Sounds painful.

1 more reply

landryl5y ago

1 more reply

blackbear_5y ago

Just because many more humans spent many more years and many more $$$ building GPT-3 for your convenience.

1 more reply

ApicalDendrite5y ago

Were there any concerns about GPT-3's latency? It looks like it takes a long time for online use cases.

sailingparrot5y ago

So GPT-3 didn't replace your 2 ML engineers, OpenAI did. GPT-3 didn't build itself.

1 more reply

srtjstjsj5y ago

Why? I can replace an excellent furniture designer with a much cheaper off the shelf desk.

fludlight5y ago

“AI” replacing the jobs of AI engineers. But we were told it was only going to do that to blue collar work!

wongarsu5y ago

ardit335y ago

That's not how it works... it is not zero-sum (i.e. the amount of work is not bounded/fixed; it increases).

Google "Induced Demand"

whymauri5y ago

Because most "AI engineering" has lost its meaning and is actually data analysis.

aaaxyz5y ago

Did it ever have a single meaning? Every company I've been through had a different definition of what "AI engineering" should be

1 more reply

jimbokun5y ago

We need UBI yesterday.

digital-cygnet5y ago

neural_thing5y ago

Extremely complex classification task is all I can say

lavoiems5y ago

GPT-3 is a generative model, isn't it? Can you explain how you converted GPT-3 to a classification model?
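One common way to use a text-completion model as a classifier is few-shot prompting: show a handful of labeled examples in the prompt and read the label off the start of the completion. The sketch below illustrates that general idea only; it is not a description of what neural_thing's team did. The `complete` callable stands in for whichever completion endpoint is used, and the "Text:"/"Label:" prompt layout and all function names are assumptions.

```python
from typing import Callable, Optional, Sequence


def build_prompt(examples: Sequence[tuple[str, str]], text: str) -> str:
    """Lay out labeled examples, then the new text with an empty label slot."""
    blocks = [f"Text: {ex_text}\nLabel: {label}\n" for ex_text, label in examples]
    blocks.append(f"Text: {text}\nLabel:")
    return "\n".join(blocks)


def classify(complete: Callable[[str], str],
             examples: Sequence[tuple[str, str]],
             text: str,
             labels: Sequence[str]) -> Optional[str]:
    """Return the known label the completion starts with, or None if none match."""
    completion = complete(build_prompt(examples, text)).strip().lower()
    for label in labels:
        if completion.startswith(label.lower()):
            return label
    return None  # the model produced something outside the label set
```

Returning `None` for off-label completions, rather than guessing, leaves it to the caller to retry or fall back to a human, which matters because nothing forces a generative model to stay inside the label set.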

2 more replies

ivalm5y ago

1 more reply

nickwalton005y ago

Yep, few-shot learning is a game changer. You don't get perfect results, but you can prototype all kinds of systems extremely fast.

nmfisher5y ago

Were you a beta user, or is this now open for public access?

st1x75y ago· 25 in thread

tiborsaas5y ago

It's so easy to downplay every advanced tech, it's actually fun.

Planes? Just a flying metal tube.

Self landing rockets? Just applied physics.

Smartphones? Just really good fab processes.

The internet? Just a bunch of computers.

CRISPR? Just a molecular scissor.

monktastic15y ago

> You are just a fancy and efficient autocomplete too.

We don't know enough about human cognition to say this.

Scott Aaronson has something interesting to say about this in a conversation with Lex Fridman, actually: https://youtu.be/G_-BBniFFCM?t=419

Quick copy and paste of part of the transcript:

Barrin925y ago

>You are just a fancy and efficient autocomplete too. When you speak or write, some words have a higher probability than others

When you talk you don't do random statistical inference, you match language to the semantics you want to express, which is not statistical.

st1x75y ago

> GPT-3 is a really impressive milestone towards AGI

We really don't know this. It's a big step for the field of language models, that's for sure. But we're so far from AGI that nobody knows which direction it's in and whether it exists at all.

moultano5y ago

> it's a language model designed to continue a sequence of words.

1 more reply

ralfn5y ago

>It has no understanding of the world

What does this mean? The world with countries, borders, celebrities, hypes, newsletters and ideologies?

If something can only live in a reality of data points on the internet this thing would live mostly in the same reality as we do.

xiphias25y ago

AI / ML? Just linear algebra... I've heard that one so many times.

vladf5y ago

I happen to use slightly less fancy and expensive GPT-2 based autocomplete, and it's amazing. https://tabnine.com

tmalsburg25y ago

surething1235y ago

0-_-05y ago

I use it too and it hasn't, you still need to look at the suggestions before you accept them. It's not that good, so I won't lose my job just yet.

vladf5y ago

No, it's not auto-completing function blocks, just simple expressions that are easy to validate. E.g.,

    let lo = 0;
    let hi = vec.len();
    let mid = lo + (hi