Or you just need a model that can recognize math, and then pass it to a system that can do math. Math is actually something traditional, non-AI systems are very good at doing (it is the raison d'être of traditional computing), so if an AI model can simply recognize that math needs to be done, there is no reason for it to do the math itself.
Wolfram Alpha already does that. But that's because Wolfram Alpha is built as a model whose purpose is "recognize what kind of problem this natural language query requires, then pass it on to the problem engine for that kind of problem", where each problem engine is an actual solution model for that kind of problem, based on actual facts about the world.
ChatGPT, though, is built as a completely different type of model, whose purpose is "find a pattern that this natural language query matches, then generate the highest-probability sequence of natural language words for that pattern based on the training data set". That's a completely different structure.
It's also possible to create fact-grounded retrieval-enhanced language models e.g. https://proceedings.mlr.press/v162/borgeaud22a.html.
Personally I think hybridization is the way to go.
https://twitter.com/goodside/status/1581805503897735168?s=20...
GPT-3 is perfectly capable of recognizing what kinds of things it will be bad at, and can be encouraged to generate machine-executable queries to fill in that gap.
(note this is based on prompting GPT-3, not ChatGPT, but the principles about what this language model is capable of apply)
It can even go one level deeper - there's an example there of it generating a python script that uses the 'wikipedia' library to look up the date of death of the Queen, as a way to fill in knowledge it doesn't have. Tell it it can use the wolframalpha module to answer questions that involve complex units, quantities, or advanced mathematics, and it'll almost certainly do that too.
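The loop the tweet demonstrates can be sketched in a few lines: the model either answers directly or emits a snippet of code, the supervisor executes the snippet, and the result is fed back into the prompt. This is a minimal sketch, not the actual demo; `complete` is a hypothetical stand-in for a GPT-3 API call, faked here so the loop runs offline (the real demo generated a `wikipedia` lookup at this step).

```python
import contextlib
import io

def complete(prompt: str) -> str:
    # Hypothetical stand-in for a GPT-3 completion call, faked so the
    # sketch is runnable. On the first pass the "model" decides it
    # lacks the fact and emits code; once a RESULT is in the prompt,
    # it answers in natural language.
    if "RESULT:" not in prompt:
        return "PYTHON: print(2022)"  # real demo: a wikipedia lookup
    return "ANSWER: The Queen died in 2022."

def answer(question: str) -> str:
    prompt = question
    while True:
        out = complete(prompt)
        if out.startswith("ANSWER:"):
            return out[len("ANSWER:"):].strip()
        code = out[len("PYTHON:"):].strip()
        # Execute the generated snippet, capture its stdout, and feed
        # the result back into the prompt for the next completion.
        buf = io.StringIO()
        with contextlib.redirect_stdout(buf):
            exec(code)
        prompt += f"\nRESULT: {buf.getvalue().strip()}"
```

The key design point is that the language model never has to know the fact; it only has to know that it doesn't know, and emit something executable.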
One of the things I love is this reply to that tweet - https://twitter.com/JulienMouchnino/status/15820120109127065...
"How is it possible that GPT-3 understands what a human can compute in his/her head?"
Riley's quick demo prompt shows that GPT-3's guesses about which mathematical results are easy to compute in one's head match human intuition surprisingly well.
By what seems to me to be the obvious choice for a definition of "bad at", namely "not answering queries based on an actual semantically connected world model", GPT-3 is bad at everything. And an obvious example of an endpoint of your perfectly reasonable suggestion to have it pass on queries to solution machines that are based on actual world models, is...Wolfram Alpha.
You could then use it to classify the questions it is bad at, and use that information (as part of a larger whole) to dispatch each query to a knowledge system that can return the proper (current) information, and report on that.
The following is a list of questions. Identify the category they belong to as one of {Current Events}, {General knowledge}, {Unit conversion}, {Math}:
1. How many feet in a mile?
2. What is the square root of 541696?
3. Who is the Speaker of the House?
4. How many turkeys in Turkey?
to which it responds:
1. Unit conversion
2. Math
3. Current Events
4. General Knowledge
The supervisor system (for lack of a better word) would then dispatch each question to a system that it is coded to use, which can either further classify the question or provide the proper answer.
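A minimal sketch of such a supervisor, under loud assumptions: `classify` stands in for the LLM classification prompt above but is faked with keyword rules so the sketch runs offline, and the handlers are toys (the Math handler only parses the demo query's trailing number; the last two just name where a real system would dispatch).

```python
import math

def classify(question: str) -> str:
    # Stand-in for the LLM classification call shown in the prompt
    # above; faked with keyword rules so the sketch runs offline.
    q = question.lower()
    if "square root" in q:
        return "Math"
    if "feet in a mile" in q:
        return "Unit conversion"
    if "speaker of the house" in q:
        return "Current Events"
    return "General knowledge"

# One handler per category; each is a placeholder for a real backend
# (a math engine, a units library, a news/knowledge API, or the
# language model itself as the fallback).
HANDLERS = {
    "Math": lambda q: str(math.isqrt(int(q.rstrip("?").split()[-1]))),
    "Unit conversion": lambda q: "5280 feet",
    "Current Events": lambda q: "dispatch to a current-events knowledge API",
    "General knowledge": lambda q: "fall back to the language model itself",
}

def supervise(question: str) -> str:
    # Classify, then route to the matching problem engine.
    return HANDLERS[classify(question)](question)
```

The dispatch-table shape is the point: each category maps to a backend that actually models its domain, and the language model's only job is the routing.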