"Interestingly, the base pre-trained [GPT-4] model is highly calibrated (its predicted confidence in an answer generally matches the probability of being correct). However, through our current post-training process, the calibration is reduced."[1] The graph is striking.[2]
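For anyone unfamiliar with what "calibrated" means here, a minimal sketch of one common way to quantify it, expected calibration error (ECE), is below. This is my own illustration, not necessarily how OpenAI measured it: the function name, the 10-bin default, and the toy data are all assumptions. The idea is to bucket answers by stated confidence and compare each bucket's average confidence with its actual accuracy.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average of |mean confidence - accuracy| over confidence bins.

    confidences: floats in [0, 1], the model's stated probability of being right.
    correct: booleans, whether each answer was actually right.
    (Hypothetical helper for illustration; not taken from the linked post.)
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        return 0.0

    # Assign each prediction to an equal-width confidence bin.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))

    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        # Weight each bin's gap by the share of predictions it holds.
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece


# Toy data (made up): 75% confident and right 3 times out of 4.
calibrated = expected_calibration_error([0.75] * 4, [True, True, True, False])

# Toy data (made up): 95% confident but right only half the time.
overconfident = expected_calibration_error([0.95] * 4, [True, True, False, False])
```

A perfectly calibrated model scores 0; a larger value means stated confidence and actual accuracy have drifted apart, which is the effect the quote attributes to post-training.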
[1] https://openai.com/research/gpt-4
[2] https://i.imgur.com/cxPgkhD.jpg