ivalm · 5y ago · 0 points

In general, GPT-3 is not SotA on (any?) classification task. Did you just not have enough data to fine-tune a discriminative transformer model? Inference should be cheaper with a smaller transformer, and there's also less lock-in.

1 comment · 1 top-level

neural_thing · 5y ago

I can't go into too much detail here about why we couldn't do that, but one aspect that we found VERY useful is that GPT-3 could draw on real-world knowledge not present in the dataset to enhance the results.
