0 points · jprete · 2y ago · 0 comments

> I've tested both cases: correcting it when it was really wrong, and correcting it confidently when it was actually right. Both times it agreed that it was wrong and regenerated the answer it gave me.

This is the peril of using what is fundamentally an autocomplete engine, albeit an extremely powerful one, as a knowledge engine. In fact, RLHF strongly favors this outcome: if the human asserts "this is right", the human doing the rating is very unlikely to uprate responses where the neural net insists that the human is still wrong. The network weights are going to get pushed in the direction of responses that agree with the human.

1 comment · 1 top-level

gaganyaan · 2y ago

The "just autocomplete" view is incorrect. I have actually had it push back on me when I incorrectly said that it was wrong.
