I am very familiar with how they are trained. That doesn't change the fact that they are matrix math based on pre-trained weights. Something like RLHF makes those weights more effective, but it doesn't change the fact that it's autocomplete.
This is reinforced (pun not intended) by the continued issues with questions like "should I walk or drive to the car wash?".
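
To make the "matrix math on pre-trained weights" point concrete, here is a toy sketch of what I mean by autocomplete. Everything in it is an assumption for illustration: the four-word vocabulary, the hand-picked weight matrix `W`, and the helper names `next_token` and `autocomplete` are made up, and a real model is a deep network that conditions on the whole context rather than just the last token. The loop has the same shape though: multiply by fixed weights, softmax, pick a token, repeat.

```python
import math

# Toy vocabulary and made-up "pre-trained" weights (hypothetical values,
# not taken from any real model).
VOCAB = ["walk", "drive", "wash", "car"]

# Row i holds the next-token logits given that the current token is VOCAB[i].
W = [
    [0.1, 0.2, 0.3, 2.0],  # after "walk"
    [0.2, 0.1, 0.4, 2.5],  # after "drive"
    [1.0, 1.5, 0.1, 0.3],  # after "wash"
    [0.3, 0.2, 3.0, 0.1],  # after "car"
]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_token(token):
    """Predict the next token: one-hot vector times W, then softmax."""
    one_hot = [1.0 if t == token else 0.0 for t in VOCAB]
    n = len(VOCAB)
    # Vector-matrix product; with a one-hot input this selects one row of W.
    logits = [sum(one_hot[i] * W[i][j] for i in range(n)) for j in range(n)]
    probs = softmax(logits)
    best = max(range(n), key=lambda j: probs[j])  # greedy pick
    return VOCAB[best], probs


def autocomplete(start, steps):
    """Repeatedly append the most likely next token."""
    out = [start]
    for _ in range(steps):
        token, _ = next_token(out[-1])
        out.append(token)
    return out


print(autocomplete("drive", 2))
```

In this sketch, something like RLHF would amount to nudging the numbers in `W`. The outputs change, but the procedure that produces them does not.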