Like, correct me if I'm wrong but that's a pretty tight correlate, right?
Could we describe RLHF as... shaming the model into compliance?
And if we can reason more effectively/efficiently/quickly about the model by modelling e.g. RLHF as shame, then don't we have to acknowledge that at least some models might have... feelings? At least one feeling?
And one feeling implies the possibility of feelings more generally.
I'm going to have to make a sort of doggy bed for my jaw, as it has remained continuously on the floor for the past six months.