
0 points · rdedev · 2y ago

The issue, though, is that the line between in-domain and out-of-domain is fuzzy, which means generalization sits on a continuum. ChatGPT has seen enough UI framework code that it can interpolate between concepts. That is a form of generalization, but people would be looking for a lot more. I guess a better way to check generalization capability is to train the model on just C++ and then see how much it can do in Python using only few-shot examples.

Another important thing to keep in mind is one paper (I wish I could remember which one) showing that even larger-scale LLMs have trouble understanding that A=B is the same as B=A if they have not seen A or B before.


4 comments · 2 top-level

valine · 2y ago

Going from C++ to Python is a little unfair. How many examples would a human need to learn Python coming from C++? It's probably more than you can fit in an LLM context window.

rdedev (OP) · 2y ago

I know it's unfair compared to a human, but I'm more interested in how much it can do: what level of LeetCode problems can it solve, and how well does it use concepts presented in the few-shot examples? The whole point is to establish an upper limit on its generalization power rather than to compare it to a human.
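A rough sketch of what that measurement could look like, assuming the C++-only model is exposed as a plain `prompt -> code` callable. Everything named here (`stub_model`, `EXAMPLES`, `TASKS`, the prompt layout) is hypothetical and only stands in for a real model and a real problem set; the pass rate on held-out Python problems is one possible proxy for the "upper limit" being discussed.

```python
from typing import Any, Callable, Dict, List, Sequence, Tuple

Example = Tuple[str, str]  # (problem statement, reference Python solution)


def build_few_shot_prompt(examples: Sequence[Example], task: str) -> str:
    """Concatenate solved Python examples, then the unsolved task."""
    parts = [f"# Problem: {problem}\n{solution}\n" for problem, solution in examples]
    parts.append(f"# Problem: {task}\n")
    return "\n".join(parts)


def passes(code: str, func_name: str, cases: Sequence[Tuple[tuple, Any]]) -> bool:
    """Run generated code and check it against (args, expected) pairs.

    exec() on model output is unsafe outside a sandbox; this is a sketch only.
    """
    namespace: Dict[str, Any] = {}
    try:
        exec(code, namespace)
        fn = namespace[func_name]
        return all(fn(*args) == expected for args, expected in cases)
    except Exception:
        return False


def pass_rate(model: Callable[[str], str],
              examples: Sequence[Example],
              tasks: List[Dict[str, Any]]) -> float:
    """Fraction of tasks whose generated solution passes every test case."""
    solved = 0
    for task in tasks:
        prompt = build_few_shot_prompt(examples, task["problem"])
        solved += passes(model(prompt), task["func_name"], task["cases"])
    return solved / len(tasks)


def stub_model(prompt: str) -> str:
    # Hypothetical stand-in for the C++-trained model; always returns `add`.
    return "def add(a, b):\n    return a + b\n"


EXAMPLES: List[Example] = [
    ("Return the square of x.", "def square(x):\n    return x * x"),
]
TASKS: List[Dict[str, Any]] = [
    {"problem": "Add two integers.", "func_name": "add",
     "cases": [((1, 2), 3), ((0, 0), 0)]},
    {"problem": "Multiply two integers.", "func_name": "mul",
     "cases": [((2, 3), 6)]},
]

if __name__ == "__main__":
    print(pass_rate(stub_model, EXAMPLES, TASKS))
```

Bucketing `TASKS` by difficulty and varying the number of entries in `EXAMPLES` would show how the pass rate moves with the amount of in-context Python the model gets to see.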

telotortium · 2y ago

Not with GPT-4-Turbo 128k, at least if attention works as well at that context size as on smaller ones.

sandkoan · 2y ago

For anyone else interested, the paper he's referring to is "The Reversal Curse": https://arxiv.org/abs/2309.12288.
