I was reading the GraphSLAM paper to get a sense of the algorithms robots use for SLAM. While reading it, I realized that I have only a tenuous grasp of probability theory, especially topics like covariance, conditional probability, and multivariate distributions (and even basics such as what a posterior probability represents).
I'd like to rectify this and gain an intuitive understanding of the subject, since it is commonly used in numerous areas of engineering.
I dislike books that present fully formed theorems without deriving or proving them. Which comprehensive book(s) can I read?