0 points · RC_ITR · 2y ago · 0 comments

I am prone to believe that OpenAI, an organization whose lead is centered on RL more than anything else, is quite good at getting its models not to spit out competitively sensitive information.

Can you get yours to give you the same verbatim?

3 comments · 1 top-level

famouswaffles · 2y ago · 2 in thread

>I am prone to believe that OpenAI, an organization whose lead is centered on RL more than anything else, is quite good at getting its models not to spit out competitively sensitive information.

Thanks for telling me you don't know how RL or LLMs work.

>Can you get yours to give you the same verbatim?

Sure I can, and others in this very thread have too. https://news.ycombinator.com/item?id=37805492

RC_ITR · OP · 2y ago

Ok then explain why RL can’t be used to prevent certain behaviors please.

Why can’t a reward function be used to stop a model from saying something you know you don’t want it to say?

Also you share a screenshot of a chat asking to repeat the above and that’s your proof?

Share the raw link please.

famouswaffles · 2y ago

>Ok then explain why RL can’t be used to prevent certain behaviors please.

Preventing certain behaviors does not mean you can make a model never output something. RL simply doesn't work that way. In this instance, you are rating certain responses as better and training the model to predict more like them. You can make it more likely to refuse a request, but the idea that you can guarantee it will always refuse is completely wrong. There is nothing OpenAI can do to make GPT-4 never do something. Nothing.
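To make that concrete, here is a toy sketch of the point, not how OpenAI actually trains GPT-4. It assumes a made-up policy with only two possible responses ("refuse" and "leak"), hypothetical rewards of +1 and -1, and exact policy-gradient updates on softmax logits. Everything here (`train`, the rewards, the learning rate) is invented for illustration. The reward pushes P(leak) down on every update, but a softmax over finite logits never assigns exactly zero probability to anything.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of finite logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train(steps, lr=0.5):
    # Hypothetical two-response policy: index 0 = "refuse", index 1 = "leak".
    logits = [0.0, 0.0]
    rewards = [1.0, -1.0]  # assumed ratings: refusing is good, leaking is bad
    for _ in range(steps):
        probs = softmax(logits)
        baseline = sum(p * r for p, r in zip(probs, rewards))
        # Exact expected policy gradient for a softmax policy:
        # d E[reward] / d logit_i = p_i * (r_i - baseline)
        for i in range(len(logits)):
            logits[i] += lr * probs[i] * (rewards[i] - baseline)
    return softmax(logits)

for steps in (0, 10, 100, 1000):
    p_refuse, p_leak = train(steps)
    print(f"{steps:>5} updates: P(leak) = {p_leak:.3e}")
```

In this toy setup the updates themselves get smaller as P(leak) shrinks, so the probability keeps falling but only approaches zero. Real RLHF pipelines differ in many ways (sampled gradients, a learned reward model, huge output spaces), but the "less likely, never impossible" behavior comes from the same softmax output layer.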

https://chat.openai.com/share/b7faf20c-b295-4d76-85a1-a15e04...
