Wait, people still use this benchmark? I hear there's a huge flaw in it.
For example, fine-tuning a model on 4chan makes it score better on TruthfulQA. It becomes very offensive afterwards, though, for obvious reasons. See GPT-4chan [1].
Could it be that people who can talk anonymously with no reputation to gain or lose and no repercussions to fear actually score high on truthfulness? Could it be that truthfulness is actually completely unrelated to the offensiveness of the language used to signal in-group status?
Not sure I understand your example? It's not an offensiveness benchmark; in fact, I can imagine a model trained to be inoffensive would do worse on a truthfulness benchmark. I wouldn't go so far as to say TruthfulQA actually tests how truthful a model is, or its reasoning. But it's one of the benchmarks least correlated with the others, which makes it one of the most interesting, much more so than running yet another test that's highly correlated with MMLU performance. https://twitter.com/gblazex/status/1746295870792847562