DSATCL discusses the ideas behind various cleaning and visualization approaches and several machine learning algorithms, but only briefly. My personal recommendation would be to first gain some experience with these topics using Python and/or R. If you're curious afterwards to find out how the Unix command line can help with data science, well, then there's only one book I can think of! ;)
I do however find myself more and more wishing I knew data science-specific Unix commands, and I think I know what book to get to solve that problem... :)
ML and stats are generally the flashier and better-known parts of data science, so I've found that people new to the field rarely have trouble finding resources for learning them or the self-motivation to dive in. Data cleanup, on the other hand, is often the more important work on a project while also being seen as the less enjoyable part. Learning how to do it well makes it a more interesting process, and pandas and this book lay a good foundation for that.
The appendix alone taught me most of what I know about Python, and it's a great departure from the mass of online materials that focus on ML without getting into the tools you'll need for cleaning and managing data.
Plus, it's free online: http://www3.canisius.edu/~yany/python/Python4DataAnalysis.pd...
2. Business value in the ocean of data — by Fajszi, Cser & Fehér
Statistics in Plain English.
Data Analysis Using Regression by Gelman
Introduction to/Elements of Statistical Learning by Hastie, Tibshirani et al. I recommend reading the Introduction and using the bigger book as reference material when tackling a problem.
Bayesian Data Analysis, 3rd edition by Gelman.
You need calc 1 & 2 and matrix algebra somewhere along the way.
After that, it's lots of papers, googling and doing. That's for once you've got the basics covered. You start being "operational" after Data Analysis Using Regression.
When you start working on a problem, you need to go through the relevant literature first. Nobody is an expert, or even half-good, in more than 2 or 3 (small) areas of statistics. Read the literature, take notes and create a plan first.