Are ML and Statistics Complementary? [pdf] (opens in new tab)

(ics.uci.edu)

68 pointssnippyhollow10y ago31 comments

31 comments

19 comments · 7 top-level

p4wnc610y ago· 5 in thread

What is commonly understood as 'statistics' is just a specialized subset of machine learning. Machine learning generalizes statistics.

The correct complement to machine learning is cryptography -- trying to intentionally build things that are provably intractable to reverse engineer.

5110910y ago

Working with both statisticians and pure machine learners on the same task, I did notice some tendencies, presuppositions and modus operandi that were different (beyond being a specialized subset). Like said in this position paper, machine learners like to throw computation and parameters at the problem, where statisticians are more careful and sober. As an analogy, a statistician will approach a cliff very carefully, stomping the ground to make sure it is sturdy enough to carry a human. They'll approach the edge of the cliff 'till they have their p-measures and that is their model. Machine learners will jump head-first off the cliff and when you listen you can hear them yell: Cross-validatioooohhhh... as they plummet down.

I like the complement with cryptography. I would add another coding method: compression - Approximating the simplest model with explanatory power.

p4wnc610y ago

I have had the exact opposite experience with machine learning and statistics. In my experience, those who come from the 'statistics' side tend to use constructs, like null hypothesis significance testing, which are not consistent even from a theoretical point of view. And further, when they use them, they do awful things like p hacking, or using a direct comparison of t-stats as a model selection criterion, which are further rife with theoretical problems, not to mention lots of statistical biases and so forth.

I find the machine learning approach is far more humble. It starts out by saying that I, as a domain expert or a statistician, probably don't know any better than a lay person what is going to work for prediction or how to best attribute efficacy for explanation. Instead of coming at the problem from a position of hubris, that me and my stats background know what to do, I will instead try to arrive at an algorithmic solution that has provable inference properties, and then allow it to work and commit to it.

Either side can lead to failings if you just try to throw an off-the-shelf method at a problem without thinking, but there's a difference between criticizing the naivety with which a given practitioner uses the method versus criticizing the method itself.

When we look at the methods themselves I see much more care, humility, and carefulness to avoid statistical fallacies in the machine learning world. I see a lot of sloppy hacks and from-first-principles-invalid (like NHST) approaches in the 'statistics' side. And even when we consider how practioners use them, both sides are pretty much equally as guilty of trying to just throw methods at a problem like a black box. Machine learning is no more of a black box than a garbage-can regression from which t-stats will be used for model selection. However, all of the notorious misuses of p-values and conflation over policy questions (questions for which a conditional posterior is necessarily required, but for which likelihood functions are substituted as a proxy for the posterior) seem very uniquely problematic for only the 'statistics' side.

Three papers that I recommend for this sort of discussion are:

[1] "Bayesian estimation supersedes the t-test" by Kruschke, http://www.indiana.edu/~kruschke/BEST/BEST.pdf

[2] "Statistical Modeling: The Two Cultures" by Breiman, https://projecteuclid.org/euclid.ss/1009213726

[3] "Let's put the garbage-can regressions and garbage-can probits where they belong" by Achen, http://www.columbia.edu/~gjw10/achen04.pdf

2 more replies

Tarrosion10y ago

That's a strong and not-at-all obvious statement. Can you elaborate?

sjg00710y ago

Machine learning does not generalize statistics; mathematics does.

p4wnc610y ago

Machine learning is one subfield of mathematics that generalizes another, further subfield (statistics).

_0w8t10y ago· 5 in thread

I think feasibility to get an explanation for the results of modern machine learning is wishful thinking. I personally cannot explain my gut feelings. So why should we expect an explanation when machine deals with the same class of problems?

Besides, it is easy to get wrong explanation and, as Vladimir Vapnik in his 3 metaphors for complex world observed, http://www.lancaster.ac.uk/users/esqn/windsor04/handouts/vap... , "actions based on your understanding of God’s thoughts can bring you to catastrophe".

5110910y ago

As we start to use AI/ML for more tasks, the need for model interpretability rises. We expect doctors to explain their gut feelings, much like we expect computer vision models that detect disease to explain their findings and have a (theoretically sound) estimate of confidence.

SVM's were so popular, pretty much because they had a firm theoretical basis on which they were designed (or "cute math" as deep learners may call it). As Patrick Winston would ask his students (paraphrasing): "Did God really meant it this way, or did humans create it, because it was useful to them?". Except maybe for the LSTM, deep learning models are not God-given. We use them because, in practice, they beat other modeling techniques. Now we need to find the theoretical grounding to explain why they work so well, and allow for better model interpretability, so these models can more readily be deployed in health care and under regulation.

_0w8t10y ago

The article calls for an explanation not why some ML method works, but for an explanation of a particular ML result, like why a car drives this way or why a patient got a cancer. While I hopeful for the former, I just do not see the basis for the latter.

If some regulations shall require such explanation, the end result will be fake stories like parents tell to the children that Moon do not fall because it is nailed to the sky.

1 more reply

Retra10y ago

Just because you can't explain your gut feelings doesn't mean they are unexplainable, or that you don't have any obligation to explain them.

jupiter9000010y ago

I hear what you're saying, but in terms of usefulness for business decisions, what leaders at a company would be satisfied with someone providing Vapnik's quote? Certainly machine learning and statistics have applications outside of business, but when it comes to realities in industry settings, very often an explanation of results is necessary, in addition to an explanation of why a particular machine learning approach is best for solving a given problem, how it works to solve the problem, etc.

p4wnc610y ago

The problem is not about how to take a scientific conclusion and make it suitably less scientific to serve as a business explanation.

The problem is to replace inept employees who believe "business decisions" are not scientific questions, so that over time there is a convergence to using the scientific method, with legitimate statistical rigor, when making a so-called business decision.

Generally speaking, the only people who want for there to be a distinction between a "business question" and a "scientific question" are people who can profit from the political manipulation that becomes possible once a question is decoupled from technological and statistical rigor. Once that decoupling happens, you can use almost anything as the basis of a decision, and you can secure blame insurance against almost any outcome.

This is why many of the experiments testing whether prediction markets, when used internally to a company, can force projects to be completed on time and under budget are generally met with extreme resistance from managers even when they are resounding successes.

The managers don't care if the projects are delivered on time or under budget. What they care about is being able to use political tools to argue for bonuses, create pockets of job security, backstab colleagues, block opposing coalitions within the firm. You can't do that stuff if everyone is expected to be scientific, so you have to introduce the arbitrary buzzword "business" into the mix, and start demanding nonsense stuff like "actionable insight" -- things that are intentionally not scientifically rigorous to ensure there is room for pliable political manipulation for self-serving and/or rent-seeking executives, all with plausible deniability that it's supposed to be "quantitative."

1 more reply

tristanz10y ago· 1 in thread

LeCun has a comment on this paper here: https://www.facebook.com/yann.lecun/posts/10153293764562143

jupiter9000010y ago

Thanks for sharing that, he's got some interesting stuff to say about this topic.

nextos10y ago· 1 in thread

I think they will eventually converge.

Probabilistic programming is already a hint of this. The most general class of probability distributions is that of non-deterministic programs. ML is just a quick and dirty way to write these programs.

murbard210y ago

It's not just a way to write them, it's a way to do inference. Probabilistic programming is extremely powerful in terms of representation but inference is, in general, intractable. Yes, you can express all those ML models as probabilistic programs, but the sampler isn't going to perform nearly as well as the original algorithm.

ktamura10y ago

They definitely are as far as their roles at (most) startups are concerned.

Unless your startup's core strategy involves machine learning, statistics tends to come handier than machine learning in the early days. Most likely, what moves your company is not a data product built atop machine learning models but the ability to draw less wrong conclusions from your data, which is the very definition of statistics. Also, in the early days of a startup, you experience small/missing data problems: You have very few customers, very incomplete datasets with a lot of gotchas. Interpreting such bad data is no small feat, but it's definitely different from training your Random Forest model against millions of observations.

washedup10y ago

Here is a link to the paper referenced in the beginning: http://courses.csail.mit.edu/18.337/2015/docs/50YearsDataSci...

Great read for anyone interested in the debate.

sjg00710y ago

This is a great summary of the field.

j / k navigate · click thread line to collapse

31 comments

19 comments · 7 top-level

p4wnc610y ago· 5 in thread

What is commonly understood as 'statistics' is just a specialized subset of machine learning. Machine learning generalizes statistics.

The correct complement to machine learning is cryptography -- trying to intentionally build things that are provably intractable to reverse engineer.

5110910y ago

I like the complement with cryptography. I would add another coding method: compression - Approximating the simplest model with explanatory power.

p4wnc610y ago

Three papers that I recommend for this sort of discussion are:

[1] "Bayesian estimation supersedes the t-test" by Kruschke, http://www.indiana.edu/~kruschke/BEST/BEST.pdf

[2] "Statistical Modeling: The Two Cultures" by Breiman, https://projecteuclid.org/euclid.ss/1009213726

[3] "Let's put the garbage-can regressions and garbage-can probits where they belong" by Achen, http://www.columbia.edu/~gjw10/achen04.pdf

2 more replies

Tarrosion10y ago

That's a strong and not-at-all obvious statement. Can you elaborate?

sjg00710y ago

Machine learning does not generalize statistics; mathematics does.

p4wnc610y ago

Machine learning is one subfield of mathematics that generalizes another, further subfield (statistics).

_0w8t10y ago· 5 in thread

5110910y ago

_0w8t10y ago

If some regulations shall require such explanation, the end result will be fake stories like parents tell to the children that Moon do not fall because it is nailed to the sky.

1 more reply

Retra10y ago

Just because you can't explain your gut feelings doesn't mean they are unexplainable, or that you don't have any obligation to explain them.

jupiter9000010y ago

p4wnc610y ago

The problem is not about how to take a scientific conclusion and make it suitably less scientific to serve as a business explanation.

1 more reply

tristanz10y ago· 1 in thread

LeCun has a comment on this paper here: https://www.facebook.com/yann.lecun/posts/10153293764562143

jupiter9000010y ago

Thanks for sharing that, he's got some interesting stuff to say about this topic.

nextos10y ago· 1 in thread

I think they will eventually converge.

murbard210y ago

ktamura10y ago

They definitely are as far as their roles at (most) startups are concerned.

washedup10y ago

Here is a link to the paper referenced in the beginning: http://courses.csail.mit.edu/18.337/2015/docs/50YearsDataSci...

Great read for anyone interested in the debate.

sjg00710y ago

This is a great summary of the field.

j / k navigate · click thread line to collapse