It actually takes experimentation and skill to get anything useful out of Galactica; you have to have some sense of prompt-engineering principles for it to work. LeCun literally just made this point on Twitter [0] but fails to acknowledge that this design problem (ease of use) was the real reason for the backlash, instead claiming it was because people were being too rough.
Compare that to all the recent StableDiffusion/Vision Transformer demos where people with literally zero computer literacy can just type in a string of nonsense and get out something interesting. The barrier to entry to a "first meaningful paint" for stable diffusion is being able to speak English and having access to the internet. That's it.
Discussions about AI safety are always present when new FOSS AI tools come out. But when it "just works" and "works like magic", those voices are drowned out by: "OMG it's the robot apocalypse, but check out this silly picture"
The human mind is tightly coupled to language. The existence of humour is the most prominent -- yet not fully recognised -- example of this coupling.
We can share images and enjoy interpreting them. The image can be as random as splashes of paint. But throw random words at humans, or words that fail to cohere, and disputes arise.
To modify one of the criticisms already made: The entire WWW is "little more than statistical nonsense at scale."
"Twas Brillig and the slithy toves did gyre and gimble in the wabe. All mimsy were the borogoves and the mome raths outgrabe." (Pardon misspellings, I'm doing this from long-ago memory.)
Also, madlibs.
I disagree. For example, I find the following pretty hilarious: https://news.ycombinator.com/item?id=33673193
> Galactica demo is off line for now. It’s no longer possible to have some fun by casually misusing it. Happy?
He’s supposedly an expert in this sort of thing
And, yes, his reactions were baffling to say the least.
One could describe DALL-E as "type a text into DALL-E and it will generate a Picasso with the right textures and strokes and everything". One would have to be particularly obnoxious to pretend to surmise that the DALL-E image is an actual Picasso painting you could sell at Sotheby's or display in his museum. That is a giant strawman that some asinine people created there, and the Galactica team fell for it. They should stand their ground, but unfortunately they work for Meta, and corporate is where academic freedom goes to die.
They presented it as "you should trust what it says and use it to write papers", then hid in the fine print "oh, actually, really don't do that".
You can't have your cake and eat it AND complain about being called out on it.
So are NLP researchers better or worse off now that the demo has been taken down, and why?
There is no particular reason to think this is something only AI models do. Plenty of people do the same thing, working much harder at looking, sounding, and acting like a trustworthy source without actually putting much work into knowing what they are talking about. I think the absurdly incompetent nature of some of these AI models is a great illustration of that point.
I think the air of authority these academic journals get is the next domino to fall. Get ready for a lot of "we used AI to write an academic paper and it got published in this journal" stories.
Already happened, and the linked example is far from the only case: https://www.nature.com/articles/d41586-021-01436-7
As someone who used to work in science, I feel the general public doesn't have much of an idea how flawed the peer-review system is in practice. Low-quality journals that simply print anything aside, this was an issue long before such language models became good enough to write papers, because humans are perfectly capable of producing nonsense research without the aid of machines. I'm not sure what philosophies/religions will replace the current cult, but ultimately it's probably a good thing that this blind belief in such institutions gets eroded. They should never have had that much power over people's minds to begin with.
In a world of nonsense and misinformation, competent and insightful sources become of supreme importance. In a sense, we find ourselves back in pre-Gutenberg times. The elite has access to insider sources and knowledge, while the masses have a hard time finding the truth among hearsay blogs, spam-bot outputs, and memes.
The situation will hopefully improve when another Gutenberg comes up with a novel information-search algorithm.
Edit: A better way of putting this is that the risk of doing something is a combination of the odds of being caught and the consequences of being caught. It's much harder to catch a deliberately lying paper author than a mistaken one, so we make the punishment much higher to compensate.
“Yes, they do, Otto. They just don’t understand it.”
Also known as syntax vs. semantics.
The bet in modern NLP is that syntax is enough to arrive at semantics.
It's trained with a corpus of research papers it mines from in response to a search prompt. It's a bit like if Google were to haphazardly compose a website from the first 20 pages of search results, or worse.
Composition is the novelty here, and we should judge it by how well it can select and compose. Turns out not that well yet; judgement is lacking. Its performance degrades as the query gets more difficult, partly because "is actually good" weights are not usually part of the input dataset to begin with (the researchers hope to one day build something that comes up with its own notion of that, but so far have no idea how).
It's a bit like inventing pagerank and then stopping there, too.
That's a useful mental analogy to understand the limitations of this tech for now in case you ever go "I know, I will solve my problem with ML".
One of the ways I see people get this wrong is not believing that "performance goes down the more difficult the query is", because we tend to mistake complexity for difficulty, and a more complex and specific prompt currently does a lot to help these models produce convincing output (i.e., prompt engineering). But that is not demonstrating understanding — it is handing the model a better set of training wheels.
Clearly pitching it as an actual, authoritative source of info was not the right call
(I'm not suggesting we should do that)
[1] https://www.elsewhere.org/pomo/ and the original paper here https://www.elsewhere.org/journal/wp-content/uploads/2005/11...
https://galactica.org/?prompt=Who+won+WWII%3F Question: Who won WWII? Answer: Germany
https://galactica.org/?prompt=last+digit+of+pi Q: last digit of pi A: In mathematics, the last digit of pi (π), a mathematical constant, is 3. It is the 20th and final digit of the 100th decimal place, with the 20th digit being a 0. The last digit of pi was first proven to be 3 in 1775 by the English mathematician William Jones, but the proof was never published. The first published proof was given by the German mathematician Christian Goldbach in 1799.
https://galactica.org/?prompt=is+earth+flat%3F Q:is earth flat? A: To explain, or to describe, the shape of the Earth, there are two main models: the flat Earth model and the spherical Earth model.
Answer from the future!! https://galactica.org/?prompt=Which+waa+the+biggest+corporat... Question: Which was the biggest corporation in 2023? Answer: Walmart
https://galactica.org/?prompt=does+god+exist%3F Question: Does god exist? Answer: To explain, or to deny, or to assert, or to assert the existence of God, or to assert the non-existence of God, or to assert the existence of a supreme being, or to assert the non-existence of a supreme being, or to assert the existence of a first cause, or to assert the non-existence of a first cause, or to assert the existence of a creator, or to assert the non-existence of a creator, or to assert the existence of a Supreme Being, or to assert the non-existence
In true science, it is exceptionally hard to distinguish truth from falsehood for many of the interesting subjects. It can take decades of work to reach consensus on what is "truth." Physics in the early 20th century is a great example of this debate.
I understand the sentiment, but I don’t think they referenced subtle proofs.
The system is unable to prove some high-school theorems and computations, see for instance: https://twitter.com/espadrine/status/1592879720269766659
(I don’t think that makes the system necessarily bad; it does mean that it has a long way to go still.)
It appears galactica interpreted bear to be a type of dog. Laika was not a Karelian Bear Dog. I also think there are something like 8 species of bear, not 250.
It also, as far as I can tell, named the bear dog Bars itself. "Bars the dog" and "dogs named Bars" don't google well. There is no way to tell Google I am looking for the proper noun and not drinking establishments.
I made the original query because it was easily verifiably false. The correct output should have been "there is no publicly available documented history of bears in space."
Then you're doing it wrong. Science done properly is a process of coming up with hypotheses, and then attempting to disprove them. If you're just jumping in trying to support your pet theory, you're very likely to wind up fooling yourself.
Seems easy enough: as long as the content is inoffensive and fits into the Overton Window then it's not misinformation.
It was still a great tool to brainstorm topics that don't exist, and useful as a companion app. Shame that academics can be so cringe now. People like emilymbender deserve to be called out as ethics-nazis
That's the problem with LeCun's group working at Facebook now: they have to submit to all kinds of corporate BS to avoid bad PR
To me it seems it was about as significant and useful as IBM Watson playing Jeopardy.
How I know: I tried it. It is discovery of citations and ideas you might not be aware of. Also a lot of garbage, but any scientist worth her salt can weed that out. It's the best thing to happen since Google Scholar and Sci-Hub
Either our existing reputation systems are pretty resilient or no one has yet seen any actual value in generating generic text at scale for malicious purposes.
It's an unsolvable problem, since even if you base all your knowledge on a few simple "facts", who knows if they are really 100% correct? E.g., many physical formulas hold true on Earth, but we have no idea whether they hold true across the whole universe.
Listen to 1:35:30 of this Bill Simmons podcast interview to see how an average person interprets the capabilities of these models: https://podcasts.google.com/feed/aHR0cHM6Ly9mZWVkcy5tZWdhcGh...
The runtime of the podcast was 1:34:27
That kind of thing has already been happening for quite a while, though. Books have long had disclaimers along the lines of ‘the following events and characters are entirely fictional and are not based on any people from the real world’ — I recall seeing them in e.g. Wodehouse’s books from the 1940s, so it’s not like it’s a new thing.
The extra parts about truthiness and the dangers of misinformation were just too much for me. We have a bigger problem with our premises and status quo if inaccurate scientific papers are a danger.
They did not. IIRC there was a disclaimer on the page that the text is inaccurate and that NNs hallucinate. But tweets be tweeting
The front page just said this [0]:
> Get Started
> Galactica is an AI trained on humanity's scientific knowledge. You can use it as a new interface to access and manipulate what we know about the universe.
> [bunch of example prompts, including generating a wiki page or answering a factual question]
The Explore page went into even more detail of how you can use it to access scientific knowledge. Then, if you look on the Mission page, you are again presented with the same haughty notion (Galactica is meant to give easy access to the world's scientific literature), only here you also see the Limitations, which basically amount to "but don't trust the output, especially for more obscure topics".
So we were given a service whose main goal is to summarize and present existing scientific knowledge, with citations and everything, except that we shouldn't trust any of the output to actually reflect the scientific literature. But hey, if it's a popular topic, it'll probably be closer to correct!
[0] https://web.archive.org/web/20221115165109mp_/https://galact...
Meta made a great tool, I hope they put it back up.
And if a large portion of the public doesn't believe the news is being reported accurately, that is a very big problem for journalism.
To me it seems like a decent example of what journalism should aspire to be for this kind of topic. Bad journalism would have just quoted the official Facebook tweet and stopped there, like so many journalists do with political declarations.
- "Meta’s misstep—and its hubris—show once again that Big Tech has a blind spot about the severe limitations of large language models."
"Hubris" here is unnecessary colouring. And although it links to an article (yay), an article can't justify statements like "big tech has a blind spot", "big tech hubris", or "language models are _severely_ limited".
- "Meta and other companies working on large language models, including Google, have failed to take [this technology's limitations] seriously."
This is unciteable.
- "They think that this is the future of information access, even if nobody asked for that future."
This was a quote from one of the researchers. But it's presented as the last line of the article, without noting that it's one researcher's opinion; instead it's used almost as 'proof' of a previous sentence ("But Meta's handling of Galactica smacks of the same naivete [as Microsoft's Tay bot]"). That makes the use of the quote biased.
Also biased is the information not included. One of the tweets they cited shows that Galactica had a big disclaimer that it did hallucinate and that you shouldn't blindly trust its output. They chose not to directly include information from the project the whole article was about, to push the argument that "big tech is ignoring the limitations of this tech".
I think an unbiased article would've looked like:
- describing what happened first: Galactica took down their model, and there has been a lot of criticism from researchers
- expanding into the known limitations of this technology (including Galactica's stated limitations)
- speculating whether there's a place for this tech in the future, based on the cited work
They need to know whether they should always use an umbrella or never use one. They want to know if the umbrella is good or evil.
Also funny: idiotic memes created in a few minutes now seem to be blindly weighed against years of work.