
0 points · whimsicalism · 2y ago · 0 comments

You don't benchmark a foundation model against an RLHF model; the results aren't very useful.

2 comments · 1 top-level

moffkalast · 2y ago · 1 in thread

This does seem to be an RLHF model, not a base model. Unless 'supervised fine-tuning' and 'human preference' mean something else.

Ah, I see there is also a llama-2-chat model.
