ChatGPT as a Calculator for Words (opens in new tab)

(simonwillison.net)

201 pointsstevefink3y ago79 comments

79 comments

58 comments · 16 top-level

SomewhatLikely3y ago· 15 in thread

The author does a good job of pointing out what may be the strongest skills of LLMs but the claim they aren't useful as a search engine didn't ring particularly true. For many questions I have ChatGPT is the best tool to use because I know the topics I'm asking about are mentioned hundreds of times in the web, and the LLM can distill down at knowledge to the specifics I'm asking about. If you treat it as a friend who has a ton of esoteric knowledge in many areas but is prone to making stuff up to sound like they know what they're talking about you can still get lots of use pulling facts and some basic reasoning out of the models.

dror3y ago

He can make the same criticism of Internet searches as he does of GPT: you shouldn't trust them until you validate them.

I find that GPT's answers are for the most part more reliable the searches, specifically today's searches. In the last 12 months, search results have become so spammy with AI generated pages (oh the irony), that it's hard to find reliable answers.

So like search, I look at GPT's answers with a grain of salt and validate them, but these days I use GPT all day every day and search rarely. To be fair, I use it a lot because I have a GPT CLI that works just the way I want it to, since I wrote it :-). https://github.com/drorm/gish

anotherpaulg3y ago

Gish looks really nice. I'm going to give it a try.

It seems like you've been using similar workflows to what I've been trying for coding with gpt?

https://github.com/paul-gauthier/easy-chat#created-by-chatgp...

2 more replies

akhilpotla3y ago

Speaking of AI generated pages, I wonder how OpenAI filter these low quality web pages out of their training set as they continue to training.

Also, I wonder how they decide what code is worth training on. Because a lot of code is written in poor style/has technical debt, it might be the case that these LLMs in the long run lead to an increase in the technical debt in our society. Plus, eventually, and this might already be happening, the LLM are going to end up training on their own outputs, so that could lead to self immolation by the model. I am not certain RLHF completely resolves this issue.

3 more replies

swader9993y ago

Google and others would be wise to add a date filter of "before summer 2023". Maybe a bit longer, but not much time left till AI spam really takes over.

1 more reply

rimliu3y ago

I imagine ChatGPT will go from the "future changing" to "glorified spam generator" quite quickly.

JieJie3y ago

“I think the right way to think of the models we create is as a reasoning engine, not a fact database. They can also act as a fact database, but that’s not really what is special about them.” —Sam Altman

Rebecca Jarvis interviews Sam Altman for ABC News Rebecca Jarvis, https://www.youtube.com/watch?v=540vzMlf-54

(I don't think this contradicts what you said.)

qwertox3y ago

From his interview with Lex Friedman:

Quoting what he says [0][1]:

> You know, a funny thing about the way we're training these models is I suspect too much of the like processing power for lack of a better word is going into using the models as a database instead of using the model as a reasoning engine. The thing that's really amazing about the system is that it, for some definition of reasoning, and we could of course quibble about it and there's plenty for which definitions this wouldn't be accurate. But for some definition it can do some kind of reasoning. And, you know, maybe like the scholars and the experts and like the armchair quarterbacks on Twitter would say, no, it can't. You're misusing the word, you know, whatever, whatever. But I think most people who have used the system would say, okay, it's doing something in this direction. And I think that's remarkable. And the thing that's most exciting and somehow out of ingesting human knowledge, it's coming up with this reasoning capability. However, we're gonna talk about that. Now, in some senses, I think that will be additive to human wisdom.

[0] https://steno.ai/lex-fridman-podcast-10/367-sam-altman-opena...

[1] https://youtu.be/L_Guz73e6fw?t=828

stavros3y ago

I'm convinced LLMs are amazing for search, because they're the only thing that managed to give me an answer when I was looking for some software I didn't know the general category name for. I described what I wanted the software to do, corrected its understanding with a few followup messages, and it gave me a list of alternatives that are exactly what I wanted, along with a category name.

Google, in comparison, returned absolutely irrelevant SEO spam.

fwlr3y ago

There’s two different things and we refer to both of them with the word “search”.

Sometimes search means “I can sort of describe what I’m looking for, can you tell me what it’s called?”. LLMs excel here. I told GPT4 I’m doing computer animation and want to do smooth blending, it told me that’s called “interpolation”, I asked for some common terms in the literature about this to help me look and it told me about LERP, SLERP, quaternions, splines, Beziers, keyframes, inverse kinematics, and motion capture. All useful jumping-off points. (A subset of this type of search is “I know what this is called, can you tell me more about it?”. This is probably the place where LLMs sell snake oil the most; they always provide a convincing explanation of the thing, but there’s no guarantee on veracity.)

Other times, search means “I have a specific phrase and I want to find occurrences of it”. LLMs aren’t just bad at this, they are constitutionally incapable of it. The way you build an LLM necessarily involves taking all specific phrases and occurrences thereof, and blending them up into a word slurry that is then condensed and abstracted into floating point weights. It no longer has the specifics to give you. It’s a shame that search engines have let this task (“ctrl-f the web”) fall by the wayside. It’s probably a large part of why people think Google search sucks now, it certainly is for me. (There’s this one essay about the Harappan civilization that I used to be able to find by searching for “strange builders mist of time”, I definitively remember that exact phrase working for me many years ago, and now it does not work and I cannot find that essay anymore.)

mateo13y ago

It would be nice if it wasn't making up nearly all the links and sources I ask of it.

simonw3y ago

I tried to hint at that with "Using them as an alternative to a search engine such as Google is one of the most obvious applications—and for a lot of queries this works just fine."

I agree: I do use it as a search engine myself for a bunch of things, but those tend to be things where I've developed a strong intuition that it's likely to give me a reasonable result.

People who haven't developed that intuition yet tend to run into problems - and will often then loudly proclaim that LLMs are evidently useless and shouldn't be trusted for anything.

lupire3y ago

How useful is it if you have to know what you expect the answer to be? How do you know you are right in your calls of when to trust it? This smells like a confirmation bias machine.

1 more reply

skybrian3y ago

I think it's best used in conjunction in with a search engine. For example, you can ask it to recommend a paper to read and then search for the paper.

layer83y ago

The internet allows you to corroborate alleged facts by looking at further sources, following references, etc. Or when you end up not finding anything useful that substantiates what you’re looking for, that can serve as a conclusion as well. You get a feeling for it after a while. ChatGPT by itself however can usually only serve as a starting point. It doesn’t allow you to reach the same level of confidence that a googling session can. I would say they complement each other. But if given the choice of only being allowed to use one of the two, I would opt for the search engine.

happyzombies3y ago

This is true about the search engine, and besides things you usually google aren’t guaranteed to be 100% accurate anyways. So how is using ChatGPT any different?

Sure things in Wikipedia or official documents could be accurate, but the internet is still full of misinformation

ftxbro3y ago· 11 in thread

As a long time LLM enjoyer I just want to mention https://generative.ink/posts/simulators/ as I think it's by far the most insightful take on the GPT LLMs even though it was from before ChatGPT. It's better than blurry jpeg and stochastic parrot etc.

stevenhuang3y ago

Yup, that's a classic article.

My favorite is Chalmer's engine bit:

> What pops out of self-supervised predictive training is noticeably not a classical agent. Shortly after GPT-3’s release, David Chalmers lucidly observed that the policy’s relation to agents is like that of a “chameleon” or “engine”:

>> GPT-3 does not look much like an agent. It does not seem to have goals or preferences beyond completing text, for example. It is more like a chameleon that can take the shape of many different agents. Or perhaps it is an engine that can be used under the hood to drive many agents. But it is then perhaps these systems that we should assess for agency, consciousness, and so on.6

fnordpiglet3y ago

Yeah that’s been my thought since the beginning. We have created all sorts of agents to date that are very compelling but lack a bridge into knowledge and synthesis of complex semantics. Wiring an ensemble of the various AI models created over the last 70 years I think is the final step.

skybrian3y ago

I think that article is misleading, because a simulator has rules. An LLM is better thought of as a storyteller, because at best it's going to follow whatever implicit rules there are very loosely and let you make rule changes of your own, more like Calvinball.

Also, whatever loose rules it has are more literary than mathematical. Plot twists often work.

tbalsam3y ago

I find this to be a better explanation than "it's just regurgitating strings of text"

No, it is clearly not, and that is a very easily testable hypothesis.

Thank you for sharing.

1 more reply

ftxbro3y ago

The essay is long and complicated so I'm not sure how much of it you read closely, but it specifically addresses this distinction between the simulator and the simulacrum.

In the analogy of the essay, your argument would be like saying that reality cannot be simply the application of quantum physics, because you are allowed to make new rules like Calvinball within reality which are different from the rules of quantum physics.

1 more reply

blueblimp3y ago

I've always thought of them a bit like improv, since they tend to follow the "yes, and..." rule, by happily continuing whatever direction you want to go. Now that the base models have been fine-tuned to avoid some topics, that's less true than it used to be, but it still feels like the most natural mode of operation.

pmoriarty3y ago

One thing I've been impressed by recently is ChatGPT's ability to explain epigrams.

How can it explain them effectively if it's just a "stochastic parrot" or a "blurry jpeg"? Neither fits what it does.

Don't such effective explanations require some sort of understanding?

Below is an example explanation, and there are more here: [1]

---

Explain "In software systems it is often the early bird that makes the worm."

This statement refers to the importance of early planning and implementation in software systems. In software development, the early stages of the project, such as requirements gathering, design, and implementation, are crucial for the success of the project.

If the early stages of the project are done well, it can set the foundation for the rest of the development process, making it easier to build and maintain the software system. On the other hand, if these stages are not done well, it can lead to delays, rework, and budget overruns.

The phrase "the early bird that makes the worm" is a reference to the proverb "the early bird catches the worm", which means that those who act early and decisively will have an advantage over others.

Similarly, in software development, if the development team acts early and puts effort into the early stages of the project, they can gain an advantage over competitors, reducing the risk of project failure, and increasing the chances of success.

In summary, the statement "In software systems, it is often the early bird that makes the worm" emphasizes the importance of early planning and implementation in software development, which can set the foundation for the rest of the development process and increase the chances of success.

[1] - https://diaphora.blogspot.com/2023/04/chatgpt-explains-epigr...

dragonwriter3y ago

It literally misses any explanation of the only thing that distinguishes the epigram from the proverb it refers to. While identifying, and reproducing, with simple contextual substitutions, and explanation of the proverb referenced is not a small feat compared to where this kind of technology was a few years ago, its pretty darned far from either (1) actually explaining the epigram, or (2) making the argument you try to make based on explaining the epigram.

1 more reply

krainboltgreene3y ago

That's not even a good explanation for that epigram and the output is similar to many written pieces of content on the internet, which ChatGPT almost surely borrows from via CommonCrawl.

Myrmornis3y ago

I’d like to know who the author is and what their background/professional experience is. Is that information available? I see they use a handle “moire”.

ftxbro3y ago

it's not me and they have wanted to be pseudonymous. pls no dox

tbalsam3y ago· 6 in thread

I've seen this take a lot, and I find it frustrating as it flies in the face of the information theory underpinning how large neural networks learn information.

This is more than just a fancy zip file of Markov sequences. Someone has got to put a stop to this silly line of reasoning, I'm not sure why more people familiar with the math of deep learning aren't doing their best to dispel this particular belief (which people will then use as the foundation for other arguments, and so on, and so on, and this is how misconceptions somehow become canon in the larger body of work).

LelouBil3y ago

Can you explain more ?

I know the basics of deep learning and I found the article accurate.

tbalsam3y ago

The fuzzy jpeg analogy and related kin ignore the internal disentanglement of ideas, which is what separates LLMs from, say, a probabilistic chain producer.

I.e. one can think of it as a NERF of an underlying manifold instead of just assembling pictures taken of the manifold, which is an important distinction to make.

I.e. it learns the manifold, not the manifold samples. That's what makes it so powerful and lets it coherently mix and match very abstract concepts together. Even if it gets it wrong, one could link that to the fuzziness of a NERF where there is not as much data.

That's why this whole "average" business is silly nonsense. We're reducing the empirical risk over the dataset, not the L2 loss over it for Pete's sake.

1 more reply

trc0013y ago

I think parent is getting at the idea that dnn learns abstract representations of the data in whatever way is useful for the mode objective function. But I agree, the article still seems to present a reasonable argument

pretendscholar3y ago

As an amatuer I think Markov chains are explicitly a crude frequency association whereas what exactly a neural network is storing to predict the next token involves stored representations in neural weights which can be far more nuanced.

1 more reply

simonw3y ago

Which take are you talking about?

tbalsam3y ago

The blurry jpeg/average argument and related.

It doesn't even match the basic math of the loss function, and implies a static snapshot instead of a decomposed, dynamic system that uses disentangled components to form a solution (whether incorrectly or correctly).

I.e. I feel it really downplays the beauty of what is happening, and that is something frustrating to me, especially when it's fairly straightforward mathematically that that is not at all the case of what's happening, at least from my personal experience/perspective.

booleandilemma3y ago· 4 in thread

It makes me sad that the next time I enjoy a piece of writing, I'm going to have to wonder if it was "enhanced" or even written wholesale by ChatGPT. I don't feel the same with arithmetic at all.

sebzim45003y ago

When you read a novel, do you spend time worrying about what sections are 'as written' from the author and which sections were cleaned up by an editor?

maphew3y ago

Yes, actually I do. Not often, but with increasing frequency.

My pet theory is that editors aren't as good as they used to be. Market pressure to publish faster and faster in a vain attempt to keep up with the internet means that fewer of them are given the time and support to get really skilled. Thus resulting in ham fisted edits that jar me out of reading flow, and thence to analysing why.

(This pressure operates the other way too. Many authors' works are pushed out the door when they should have had more editing. )

majormajor3y ago

Is this a result of the machine-instead-of-human-reviewer/editor difference, or because of a sense that the writing is less a "pure" output of a singular author?

booleandilemma3y ago

Well I think it's like, you come to enjoy an author's particular style of writing, and if people are just going to use ChatGPT to write things for them, then they're not going to develop any style. Everyone might even end up all sounding the same.

With a calculator this is a feature. We want computations to be the same after all. Everyone should be able to get the same results when they enter the same numbers in. But this homogeneity doesn't belong in writing.

2 more replies

hbarka3y ago· 3 in thread

Confabulate is a better word than hallucinate. How did “hallucinating” get popularized? It’s a terrible term in this context.

leobg3y ago

Ha! Totally agree. Used it in my emails to OpenAI in the early days of GPT-3, hoping they’d adapt the term. I knew it from psychology, where it refers to people giving rational explanations for something that isn’t there (specifically in split brain patients).

I guess “hallucinate” stuck because it works across all disciplines: text, audio, vision…

jazzyjackson3y ago

I feel like it had something to do with DeepDream - in the popular consciousness, tripping on acid / hallucinating became something that computers are surprisingly good at, and maybe that transferred to text models.

hoseja3y ago

Probably from the early deep dream pictures?

stevenhuang3y ago· 2 in thread

> Language models don’t—if you run the same prompt through a LLM several times you’ll get a slightly different reply every time.

You can get deterministic output (on a given machine) by setting temperature=0. The Chatgpt interface doesn't let you do that, but the playground API does.

xigency3y ago

This is an interesting argument. It's evident that the LLM /can/ give different answers to the same question and that they are stochastic in nature. But being computer programs we can make them deterministic with respect to input as you suggest.

More to the point, I don't think a "calculator for words" should be deterministic. Operating on language is much more subjective than operating on numbers. If anything, this is a human limitation that we expect only one answer to one question. I'm a contrarian to Chomsky's philosophy, as he's always been pessimistic of statistical language processing and often approaches from the more objective-side like grammar and parsing.

I'm waiting for the point where we can tap knowledge from Deep Learning models to build rule-sets that appease the deterministic crowd (and get the insight of what an LLM is really modeling). A breakthrough here could also help with two big problems a) alignment and b) copyright.

spencerchubb3y ago

Also if you specify the seed you get deterministic output, assuming that the interface allows to specify a seed.

freediver3y ago· 1 in thread

Another powerful use not mentioned in the article is the ability to convert unstructured data into structured date.

For example you can copy paste a page describing API documentation and ask an LLM to not only make an API call but then also interpret results. This is the most fascinating use of LLMs to me so far.

travisjungroth3y ago

I think this is one of the most powerful uses cases for the next one maybe two years (I have a hard time making guesses beyond that point these days). There’s a lot of stuff you can do with structured data. Millions of existing applications. Until now, that world is connected with our everyday world with something like a rope bridge. It’s like a six lane steel suspension bridge just popped up.

teekert3y ago

"The ChatGPT model is huge, but it’s not huge enough to retain every exact fact it’s encountered in its training set."

That's because there is no way for the model to take the internet and separate fact from fiction or truth from falsehood. So it should not even try to, unless it can somehow weigh options (or preform its own experiments). And that doesn't mean counting occurrences, it means figuring out a coherent worldview and using it as a prior to interpret information, and then still acknowledging that it could be wrong.

davidthewatson3y ago

The most telling thing about the state-of-the-art currently is that somewhere between all the marketecture in no-code and low-code, we can essentially create a recursive crawler on-the-fly on par with a large cloud provider (because it is) and ask it to do recursive crawls that at least begin to approach a sensemaking machine, i.e. it can provide consensus for subjective truth agreed upon by experts above and beyond simple objective truth. To me, that's the great innovation that truly embraces and extends average intelligence by providing a prosthetic device for the brain to make inferences that would have only been approachable by high functioning individuals previously, i.e. the kind of argument about consciousness that you see from low latency inhibition researchers like Peterson. What any individual does with this is really where prompt engineering becomes the human-computer agent collaboration that should be our default mode in computing - a kind of tortoise wins the race story where we lost our minds in the race to interaction via javascript. It's not terribly interesting to watch a computer type. There's a place for batch mode (queuing, etc) , if the tools built up around it handle long running job management well. Sadly, that seems rarer to me now than 30 years ago.

skybrian3y ago

Often it's a hint generator that tells you where to look and what you could try.

The hints are not calculated from the input, they're from the training set.

trc0013y ago

Good to see more people pointing out the problems using standalone llm (I.e. not connected to an external data source/the internet) for search. So many people I talk to dismiss gpt because it can’t answer some subject-specific question accurately.

mif3y ago

I think the editorial capabilities of ChatGPT are fantastic, and the author provides a good list of examples. On top of that I would add that ChatGPT is really good as composing text. The meaning making is therefore still what we have to do.

zone4113y ago

Please stop linking to Ted Chiang's article. I like him as an imaginative writer, but his article is just wrong and gives readers incorrect intuitions. His claim that GPT models are not able to learn decimal addition has been known to be false for years and you can verify yourself that GPT-4 can do it.

transitivebs3y ago

Great analogy & breakdown.

My go-to explanation is to think of ChatGPT like a really intelligent friend who's always available to help you out – but they're also super autistic, and you need to learn the best way to interact with them over time.

gloosx3y ago

I think a better analogy would be: "ChatGPT as a Crappy Primary School Teacher", has hard-coded traditionalistic morales, and can answer anything but information quality is strictly not guaranteed.

CamelRocketFish3y ago

> if you run the same prompt through a LLM several times you’ll get a slightly different reply every time.

If it has the same seed, why would you get a different reply?

j / k navigate · click thread line to collapse

79 comments

58 comments · 16 top-level

SomewhatLikely3y ago· 15 in thread

dror3y ago

He can make the same criticism of Internet searches as he does of GPT: you shouldn't trust them until you validate them.

anotherpaulg3y ago

Gish looks really nice. I'm going to give it a try.

It seems like you've been using similar workflows to what I've been trying for coding with gpt?

https://github.com/paul-gauthier/easy-chat#created-by-chatgp...

2 more replies

akhilpotla3y ago

Speaking of AI generated pages, I wonder how OpenAI filter these low quality web pages out of their training set as they continue to training.

3 more replies

swader9993y ago

Google and others would be wise to add a date filter of "before summer 2023". Maybe a bit longer, but not much time left till AI spam really takes over.

1 more reply

rimliu3y ago

I imagine ChatGPT will go from the "future changing" to "glorified spam generator" quite quickly.

JieJie3y ago

Rebecca Jarvis interviews Sam Altman for ABC News Rebecca Jarvis, https://www.youtube.com/watch?v=540vzMlf-54

(I don't think this contradicts what you said.)

qwertox3y ago

From his interview with Lex Friedman:

Quoting what he says [0][1]:

[0] https://steno.ai/lex-fridman-podcast-10/367-sam-altman-opena...

[1] https://youtu.be/L_Guz73e6fw?t=828

stavros3y ago

Google, in comparison, returned absolutely irrelevant SEO spam.

fwlr3y ago

There’s two different things and we refer to both of them with the word “search”.

mateo13y ago

It would be nice if it wasn't making up nearly all the links and sources I ask of it.

simonw3y ago

I tried to hint at that with "Using them as an alternative to a search engine such as Google is one of the most obvious applications—and for a lot of queries this works just fine."

I agree: I do use it as a search engine myself for a bunch of things, but those tend to be things where I've developed a strong intuition that it's likely to give me a reasonable result.

People who haven't developed that intuition yet tend to run into problems - and will often then loudly proclaim that LLMs are evidently useless and shouldn't be trusted for anything.

lupire3y ago

How useful is it if you have to know what you expect the answer to be? How do you know you are right in your calls of when to trust it? This smells like a confirmation bias machine.

1 more reply

skybrian3y ago

I think it's best used in conjunction in with a search engine. For example, you can ask it to recommend a paper to read and then search for the paper.

layer83y ago

happyzombies3y ago

This is true about the search engine, and besides things you usually google aren’t guaranteed to be 100% accurate anyways. So how is using ChatGPT any different?

Sure things in Wikipedia or official documents could be accurate, but the internet is still full of misinformation

ftxbro3y ago· 11 in thread

stevenhuang3y ago

Yup, that's a classic article.

My favorite is Chalmer's engine bit:

fnordpiglet3y ago

skybrian3y ago

Also, whatever loose rules it has are more literary than mathematical. Plot twists often work.

tbalsam3y ago

I find this to be a better explanation than "it's just regurgitating strings of text"

No, it is clearly not, and that is a very easily testable hypothesis.

Thank you for sharing.

1 more reply

ftxbro3y ago

The essay is long and complicated so I'm not sure how much of it you read closely, but it specifically addresses this distinction between the simulator and the simulacrum.

1 more reply

blueblimp3y ago

pmoriarty3y ago

One thing I've been impressed by recently is ChatGPT's ability to explain epigrams.

How can it explain them effectively if it's just a "stochastic parrot" or a "blurry jpeg"? Neither fits what it does.

Don't such effective explanations require some sort of understanding?

Below is an example explanation, and there are more here: [1]

---

Explain "In software systems it is often the early bird that makes the worm."

[1] - https://diaphora.blogspot.com/2023/04/chatgpt-explains-epigr...

dragonwriter3y ago

1 more reply

krainboltgreene3y ago

That's not even a good explanation for that epigram and the output is similar to many written pieces of content on the internet, which ChatGPT almost surely borrows from via CommonCrawl.

Myrmornis3y ago

I’d like to know who the author is and what their background/professional experience is. Is that information available? I see they use a handle “moire”.

ftxbro3y ago

it's not me and they have wanted to be pseudonymous. pls no dox

tbalsam3y ago· 6 in thread

I've seen this take a lot, and I find it frustrating as it flies in the face of the information theory underpinning how large neural networks learn information.

LelouBil3y ago

Can you explain more ?

I know the basics of deep learning and I found the article accurate.

tbalsam3y ago

The fuzzy jpeg analogy and related kin ignore the internal disentanglement of ideas, which is what separates LLMs from, say, a probabilistic chain producer.

I.e. one can think of it as a NERF of an underlying manifold instead of just assembling pictures taken of the manifold, which is an important distinction to make.

That's why this whole "average" business is silly nonsense. We're reducing the empirical risk over the dataset, not the L2 loss over it for Pete's sake.

1 more reply

trc0013y ago

pretendscholar3y ago

1 more reply

simonw3y ago

Which take are you talking about?

tbalsam3y ago

The blurry jpeg/average argument and related.

booleandilemma3y ago· 4 in thread

It makes me sad that the next time I enjoy a piece of writing, I'm going to have to wonder if it was "enhanced" or even written wholesale by ChatGPT. I don't feel the same with arithmetic at all.

sebzim45003y ago

When you read a novel, do you spend time worrying about what sections are 'as written' from the author and which sections were cleaned up by an editor?

maphew3y ago

Yes, actually I do. Not often, but with increasing frequency.

(This pressure operates the other way too. Many authors' works are pushed out the door when they should have had more editing. )

majormajor3y ago

Is this a result of the machine-instead-of-human-reviewer/editor difference, or because of a sense that the writing is less a "pure" output of a singular author?

booleandilemma3y ago

2 more replies

hbarka3y ago· 3 in thread

Confabulate is a better word than hallucinate. How did “hallucinating” get popularized? It’s a terrible term in this context.

leobg3y ago

I guess “hallucinate” stuck because it works across all disciplines: text, audio, vision…

jazzyjackson3y ago

hoseja3y ago

Probably from the early deep dream pictures?

stevenhuang3y ago· 2 in thread

> Language models don’t—if you run the same prompt through a LLM several times you’ll get a slightly different reply every time.

You can get deterministic output (on a given machine) by setting temperature=0. The Chatgpt interface doesn't let you do that, but the playground API does.

xigency3y ago

spencerchubb3y ago

Also if you specify the seed you get deterministic output, assuming that the interface allows to specify a seed.

freediver3y ago· 1 in thread

Another powerful use not mentioned in the article is the ability to convert unstructured data into structured date.

For example you can copy paste a page describing API documentation and ask an LLM to not only make an API call but then also interpret results. This is the most fascinating use of LLMs to me so far.

travisjungroth3y ago

teekert3y ago

"The ChatGPT model is huge, but it’s not huge enough to retain every exact fact it’s encountered in its training set."

davidthewatson3y ago

skybrian3y ago

Often it's a hint generator that tells you where to look and what you could try.

The hints are not calculated from the input, they're from the training set.

trc0013y ago

mif3y ago

zone4113y ago

transitivebs3y ago

Great analogy & breakdown.

gloosx3y ago

I think a better analogy would be: "ChatGPT as a Crappy Primary School Teacher", has hard-coded traditionalistic morales, and can answer anything but information quality is strictly not guaranteed.

CamelRocketFish3y ago

> if you run the same prompt through a LLM several times you’ll get a slightly different reply every time.

If it has the same seed, why would you get a different reply?

j / k navigate · click thread line to collapse