BLOOMChat, a 176B parameter, Multi-lingual, fine tuned chat

(huggingface.co)

40 points · hatcherdogg · 3y ago · 14 comments

11 comments · 2 top-level

Giorgi · 3y ago · 9 in thread

It has some weird math problems. I asked it to compare itself to ChatGPT, and it responded that while ChatGPT is trained on 120 billion messages, BLOOMChat is trained on 1.7 billion messages, and thus BLOOMChat is trained on more data.

When I asked which is more, 1.7 or 120, it said 1.7 is the greater number and then started spewing complete garbage math: 1.7 - 120 = 60, and since 60 is more than 0, 1.7 is more than 120.

Utter garbage

JimmyAustin · 3y ago

LLMs being bad at math is a known issue, but what they are good at is writing programs. For example:

>>> Write a program that calculates if 120 is greater than 0.7.

>Sure, here's a program in Python that calculates if 120 is greater than 0.7:

    if 120 > 0.7:
        print("Yes, 120 is greater than 0.7")
    else:
        print("No, 120 is not greater than 0.7")

For straight input/output like what this model is trained on, questions like this don't work well. However, if LLMs are equipped with tools (like a code interpreter), they get a lot smarter.
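To illustrate the "equipped with tools" idea, here is a minimal sketch of the kind of calculator a host program could expose to a model: the model writes an expression, and ordinary code evaluates it. The function name `evaluate` and the operator whitelist are assumptions made for this example, not part of BLOOMChat or any particular framework.

```python
import ast
import operator

# Whitelisted operators; anything outside these tables is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_COMPARE = {
    ast.Gt: operator.gt,
    ast.Lt: operator.lt,
    ast.GtE: operator.ge,
    ast.LtE: operator.le,
    ast.Eq: operator.eq,
}


def _eval_node(node):
    # Plain numbers only (bool is a subclass of int, so exclude it explicitly).
    if (isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
        return _BINARY[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval_node(node.operand)
    # Single comparisons such as "120 > 0.7"; chained comparisons are rejected.
    if (isinstance(node, ast.Compare)
            and len(node.ops) == 1
            and type(node.ops[0]) in _COMPARE):
        left = _eval_node(node.left)
        right = _eval_node(node.comparators[0])
        return _COMPARE[type(node.ops[0])](left, right)
    raise ValueError(f"unsupported syntax: {type(node).__name__}")


def evaluate(expression: str):
    """Evaluate a small arithmetic or comparison expression without eval()."""
    tree = ast.parse(expression, mode="eval")
    return _eval_node(tree.body)


if __name__ == "__main__":
    print(evaluate("120 > 0.7"))
    print(evaluate("1.7 > 120"))
```

Walking the parsed tree instead of calling `eval()` means model-written text such as function calls or attribute access is refused rather than executed.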

jug · 3y ago

Try not to use it on math. Only GPT-4 has reasonable performance there. GPT-3.5 is also pretty awful. It's apparently extremely hard for any LLM to actually understand math. Maybe because they're language models, not math models, so math is a pretty far-fetched "emergent property".

jiggawatts · 3y ago

Nobody does long-form arithmetic in the texts seen by these LLMs. Everyone uses calculators, so the AIs only see the result, not the step-by-step process to get there.

I would expect the models to be bad at, say, division of long numbers in the same way humans are bad at doing the same calculations in their head!
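For a sense of what the "step-by-step process" looks like when it is actually written out, here is a small sketch of schoolbook long division that records every intermediate step. The function name `long_division_steps` and the shape of the returned steps are assumptions made for this illustration.

```python
def long_division_steps(dividend: int, divisor: int):
    """Schoolbook long division for a non-negative dividend and positive divisor.

    Returns (quotient, remainder, steps). Each step is a tuple of
    (running value, quotient digit chosen, remainder carried forward).
    """
    if divisor <= 0 or dividend < 0:
        raise ValueError("expects dividend >= 0 and divisor > 0")
    steps = []
    quotient_digits = []
    remainder = 0
    for digit in str(dividend):
        current = remainder * 10 + int(digit)  # bring down the next digit
        q = current // divisor                 # largest digit that fits
        remainder = current - q * divisor      # what is left to carry on
        quotient_digits.append(str(q))
        steps.append((current, q, remainder))
    return int("".join(quotient_digits)), remainder, steps


if __name__ == "__main__":
    quotient, remainder, steps = long_division_steps(987654, 321)
    for current, q, rem in steps:
        print(f"{current} / 321 -> digit {q}, remainder {rem}")
    print(quotient, remainder)
```

Each dividend digit produces one recorded step, which is exactly the intermediate work that rarely appears in text once people reach for a calculator.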

jxy · 3y ago

Bloom models are hopelessly under-trained. This one is worse than a 13B Vicuna.

kz919 · 3y ago

You are not wrong, but I don't think that is their point. It's just a muscle flex to show you that their hardware works. If it can train 176B or whatever, it should be easy peasy for them to train 13B.

Semaphor · 3y ago

I've got similar garbage out of ChatGPT (though tbf, pre-GPT-4). I don't think LLMs can understand math.

eightysixfour · 3y ago

Why do people rush to test these on math problems, the things computers are already really good at?

chii · 3y ago

it's an attempt to see if the output is "actual intelligence", or just statistical bullshit.

kz919 · 3y ago

you just need to teach it to use a calculator https://arxiv.org/pdf/2302.04761.pdf
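A rough sketch of what "using a calculator" can mean in practice: the model emits a marked-up call inside its text, and a post-processing step replaces the call with the computed value. The `[Calculator(...)]` marker syntax and the function name `fill_calculator_calls` are assumptions chosen for this example, and only a single binary operation per call is handled; this is not code from the linked paper.

```python
import re
from decimal import Decimal

# Matches calls like "[Calculator(1.7 - 120)]": two decimal literals
# separated by one of + - * /. (Marker syntax is an assumption.)
CALL = re.compile(
    r"\[Calculator\(\s*(-?\d+(?:\.\d+)?)\s*([-+*/])\s*(-?\d+(?:\.\d+)?)\s*\)\]"
)

OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
}


def fill_calculator_calls(text: str) -> str:
    """Replace every calculator call in `text` with its computed result."""
    def run(match):
        left, op, right = match.groups()
        # Decimal avoids binary floating-point noise in the printed result.
        return str(OPS[op](Decimal(left), Decimal(right)))
    return CALL.sub(run, text)


if __name__ == "__main__":
    print(fill_calculator_calls("1.7 - 120 = [Calculator(1.7 - 120)]"))
```

With this split, the model only has to decide when to call the tool and what to pass in; the arithmetic itself is done by ordinary code.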

hatcherdogg (OP) · 3y ago

BLOOMChat is a 176B chat model able to have multilingual conversations after being fine-tuned on English data. Built by SambaNovaAI and Together by fine-tuning chat
