It is important that students understand the provenance of the inferential techniques they use, so that they don't end up doing bogus science (which hurts the world) because they don't know the failure modes of these techniques. Of course, not all students of statistics know the mathematics required to understand it all, but at the very least the failure modes should be put into cookbook form.
For the sake of science, please don't ever do any inferential statistics without knowing when the method you're using works and when it breaks, what it is robust to, and what assumptions it makes. Statistics is really easy to break when used naively. The mathematics of statistics is not easy, and the results are often highly counter-intuitive.
Lots of good criticisms in this thread, which I'll have to look at. This one, however, is not. :) How many intro stats books, of the traditional kind, mention MLE, method of moments, biased vs. unbiased estimators, etc.? None that I've seen. So you're right, it becomes more "cookbooky" as a result. However, I would argue that all Bayesian analysis follows the same recipe, whereas frequentist analysis typically follows many recipes that are not obviously connected. It is that part that I criticize, not the fact that there is a recipe for doing things.
Oh - there are quite a few. Here's a small sample (no pun intended):
- Probability and Statistical Inference by Hogg & Tanis (we used this in my stats course)
- Modern Mathematical Statistics with Applications by Devore & Berk
- Probability and Statistics by DeGroot & Schervish
A quantity that follows a normal distribution has two parameters to estimate: the mean, and the variability of the quantity (the standard deviation). Both are estimated, with uncertainty, from a series of observations of the quantity (the data). The t-distribution allows us to make inferences that take both sources of uncertainty into account for a normally distributed quantity.
However, as the number of observations increases towards thirty (the usual rule of thumb), the estimate of the standard deviation gets good enough that you can happily ignore the uncertainty in it. Then the normal distribution is an adequate approximation.
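To make that concrete, here is a rough simulation sketch, not anything from the book. It draws normal samples, builds the "ignore the uncertainty" interval (sample mean plus or minus 1.96 estimated standard errors) and counts how often that interval covers the true mean. The function name, the seed, the trial count and the sample sizes 5 and 30 are all my own choices for illustration.

```python
import random

def z_interval_coverage(n, trials=20000, seed=0):
    """Fraction of trials where mean +/- 1.96 * s / sqrt(n) covers the true mean.

    The interval uses the normal critical value (1.96) together with the
    *estimated* standard deviation s, i.e. it ignores the uncertainty in s.
    """
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    true_mean, true_sd = 0.0, 1.0      # assumed "true" parameters for the demo
    z = 1.96                           # two-sided 95% normal critical value
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        mean = sum(sample) / n
        # Sample standard deviation with the n - 1 denominator.
        s = (sum((x - mean) ** 2 for x in sample) / (n - 1)) ** 0.5
        half_width = z * s / n ** 0.5
        if abs(mean - true_mean) <= half_width:
            hits += 1
    return hits / trials

coverage_small = z_interval_coverage(5)
coverage_large = z_interval_coverage(30)
print(f"n=5:  coverage {coverage_small:.3f}")
print(f"n=30: coverage {coverage_large:.3f}")
```

If the argument above is right, the n=5 interval should cover the true mean noticeably less often than the nominal 95% (theory says about 88%), while at n=30 it should come out near 94%, which is close enough that most people stop worrying. Swapping 1.96 for the matching t critical value would bring both back to 95%.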
But it's obviously a labor of love, and it's an interesting take on intro to stats. And, from skimming it, I don't see anything in it that's wrong. So this might be a good intro to Bayesian stats for most HN readers.
edit: there is a wide range of quality for the graphs, though. Some look great, but some (the histograms especially) are... unappealing. And the formatting for the code sections is quite at odds with the style of the rest of the book. Those are minor, though.
second edit: not to start a license flamewar, but can this book be redistributed? It's licensed under either CC or GNU FDL, but I don't see a way to get the source code. So anyone hosting a copy would also need to license it under the FDL (since they can't remove the FDL licensing from the pdf), which they would then be violating. Am I understanding things correctly, or am I wrong?
That's not very unusual. It seems to follow the "logic of science" approach from Jaynes. Hypothesis testing is covered in chapters 4 and 6. Other books (MacKay, Jaynes, Murphy) only cover frequentist hypothesis testing to argue against it, so this is rather refreshing.
The recent "cite crappy Whoever paper here" goof in a peer-reviewed journal is a typical example, and is notable only in that it is so egregious that it was caught and publicized. It is essentially certain that a large fraction of published papers contain at least one significant typo. I know of one case where two figures in a paper were identical (figure 2 was duplicated in figure 3) and it was missed by the co-authors (one of whom was fanatically careful), the journal editors, and the referees.
We are never directly aware of our own inattentiveness, by definition, so the reality of how inattentive we are comes as a constant surprise.
To twist this vaguely back on topic: as well as being attentionally blind, we are also probability blind. I liken this to colour-blindness: we simply do not see probability distributions and have a terrible time thinking about them, yet we are completely immersed in them every day.
Between these two things--attentional blindness and probability blindness--we frequently end up interacting with the universe in ways that make little or no sense, as we behave as if we a) notice everything and b) live in a world of certain outcomes. The modern revolution of treating probability theory as logic is a huge deal, and people who adopt it are likely to have a considerable advantage in the years ahead. For one thing, it makes dealing with our attentional blindness easier, because it helps us understand our imperfect attentional capabilities and represent them in our reasoning.
http://www.jstor.org/discover/10.2307/2683689?uid=2&uid=4&si...
The Statistics Online Computational Resource (SOCR) site is also amazing for actually learning and playing with common statistical tests and tools: http://www.socr.ucla.edu/
Collaborative Statistics is a free and interactive statistics textbook: https://www.kno.com/book/details/productId/txt9780983804905
You can also run Sage, R, Python, Octave (Matlab clone) and other tools right in the browser now: https://cloud.sagemath.com/
(I taught 36-201, the intro stats course that was used to build the OLI course, this summer.)
Statistical Inference, on the other hand, seems to take a Bayesian perspective and is very much not your standard intro stats class. It looks interesting and I'll have to skim through some of it.