Statistics as algorithmic summarization

(benjamin-recht.github.io)

56 points · diego898 · 4y ago · 18 comments

14 comments · 3 top-level

kqr · 4y ago · 9 in thread

> For a simple starter example, we know that every person on Earth has a height, defined as the distance from the bottom of their feet to the top of their head when standing upright.

This is a silly mistake to make in an article on statistics. Any statistician worth their salt can see the problem with an operational definition of height that starts with "bottom of their feet".

I like this article, but then it goes on to miss another very critical part of the explanation:

> What if we selected a subset at random, and used this subset to estimate the mean? That is, we could collect a random sample of individuals from the population and measure the average height of all of the individuals in the sample.

Why would we select a subset at random? Why shouldn't we try to select a representative subset? We know from experience that biological sex, country of birth, and affluence of family are three strong predictors of height. Why shouldn't we deliberately select a subset that balances out these variables in proportion to their population?

There is a good reason to do a random sample rather than try to construct a representative sample, but the article just brushes past it!

stewbrew · 4y ago

Are you questioning random samples? Seriously? The notion of representativeness is flawed and often does not work well in practice, i.e. it often leads to biased estimates. Unless there is a good reason for not doing a random sample (mostly for economic considerations), simple random samples are fine.

kqr · 4y ago

I'm not questioning random sampling. Quite the opposite, and I even said as much in the comment you responded to:

> There is a good reason to do a random sample rather than try to construct a representative sample, but the article just brushes past it!

hytdstd · 4y ago

> Any statistician worth their salt can see the problem with an operational definition of height that starts with "bottom of their feet".

I don't get it, what's the problem with that?

kqr · 4y ago

When field workers go out to measure people from their feet they will inevitably encounter someone who does not have feet. The ambiguity of how the definition applies to this case can easily lead to field workers doing weird things like substituting (non-randomly) people with feet for those without, or trying to guess "if this person had feet" or simply omitting that unit from the sample entirely.

Any one of those methods is fine, but leaving it unspecified is a threat to external validity.

omegalulw · 4y ago

> > For a simple starter example, we know that every person on Earth has a height, defined as the distance from the bottom of their feet to the top of their head when standing upright.

> This is a silly mistake to make in an article on statistics. Any statistician worth their salt can see the problem with an operational definition of height that starts with "bottom of their feet".

You missed the forest for the trees.

> Why would we select a subset at random? Why shouldn't we try to select a representative subset? We know from experience that biological sex, country of birth, and affluence of family are three strong predictors of height. Why shouldn't we deliberately select a subset that balances out these variables in proportion to their population?

Random subsets are core to statistics, and they are independent of representative samples. You choose random subsets so that you can compute statistics with well-defined confidence intervals (the less randomness there is in your sample, the poorer and harder to quantify your confidence will be).

You use representative samples when you want to compare groups. Representative samples should still be random per group.
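To make the confidence-interval point concrete, here is a minimal sketch that is not from the article. It assumes a synthetic population of heights (the 170 cm mean and 10 cm spread are invented), uses a normal-approximation 95% interval, and `srs_mean_ci` is a hypothetical helper name.

```python
import math
import random
import statistics


def srs_mean_ci(population, n, z=1.96, seed=0):
    """Estimate the population mean from a simple random sample of size n.

    Returns the sample mean and a normal-approximation interval
    (z=1.96 corresponds to roughly 95% coverage).
    """
    rng = random.Random(seed)
    sample = rng.sample(population, n)  # sampling without replacement
    mean = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(n)
    return mean, (mean - z * std_err, mean + z * std_err)


# Synthetic population: heights in cm (made-up parameters).
pop_rng = random.Random(42)
population = [pop_rng.gauss(170.0, 10.0) for _ in range(100_000)]
true_mean = statistics.fmean(population)

estimate, (lo, hi) = srs_mean_ci(population, n=400, seed=1)
print(f"true mean {true_mean:.2f}, estimate {estimate:.2f}, "
      f"interval ({lo:.2f}, {hi:.2f})")
```

The interval width depends only on the sample's spread and size, which is what makes the error quantifiable when the sample is drawn at random.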

thelastbender12 · 4y ago

> Why would we select a subset at random? Why shouldn't we try to select a representative subset?

Isn't this similar to the point this article is trying to make? Looking at statistics as a collection of algorithmic recipes shifts attention to how those procedures are designed and when they yield useful summaries. When you want a "representative mean" rather than the population average, you'd just tweak your algorithm.

kqr · 4y ago

The mean from a representative sample is, if you got the representation right, "equal" to the mean from a random sample, and both are "equal" to the population average.

The problem with representative samples is that you can't know whether you've actually constructed one, and even if you have, you'll find it hard to compute the errors of your estimations.

My beef is not with thinking of traditional frequentist-objectivist statistics as a set of algorithms solving specific problems (because that's what it is); my beef is with not explaining at the outset exactly when the algorithms apply and why.

When you take a black boxy algorithmic approach to something, you have to be particularly clear about things like that.
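A rough simulation of the concern above, as a sketch under invented assumptions: the population, the 5 cm affluence effect, and the "easiest to reach first" recruitment rule are all made up. The quota sample matches the population's sex proportions exactly, yet it is still off because a variable nobody balanced on (affluence) drives who gets picked.

```python
import random
import statistics

rng = random.Random(7)


def make_person():
    # All effect sizes here are illustrative, not real anthropometric data.
    sex = rng.choice(["f", "m"])
    affluent = rng.random() < 0.3
    base = 163.0 if sex == "f" else 176.0
    height = base + (5.0 if affluent else 0.0) + rng.gauss(0.0, 7.0)
    return {"sex": sex, "affluent": affluent, "height": height}


population = [make_person() for _ in range(20_000)]
true_mean = statistics.fmean(p["height"] for p in population)
n = 1_000

# Simple random sample: every person is equally likely to be chosen.
srs = rng.sample(population, n)
srs_mean = statistics.fmean(p["height"] for p in srs)

# Quota sample: proportions by sex match the population, but within each
# stratum the recruiter fills the quota with affluent people first.
quota = []
for sex in ("f", "m"):
    stratum = [p for p in population if p["sex"] == sex]
    k = round(n * len(stratum) / len(population))
    easiest_first = sorted(stratum, key=lambda p: not p["affluent"])
    quota.extend(easiest_first[:k])
quota_mean = statistics.fmean(p["height"] for p in quota)

print(f"true {true_mean:.2f}  random {srs_mean:.2f}  quota {quota_mean:.2f}")
```

The quota sample looks representative on the variable that was checked, and nothing in the sample itself reveals the bias or lets you put an error bar on it.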

xenonite · 4y ago

> We know from experience that biological sex, country of birth, and affluence of family are three strong predictors of height.

I guess that an even better predictor is the parents' height.

ZeroGravitas · 4y ago

In some countries/eras the next generation is reliably much taller than the parents, as non-genetic factors dominate.

I'm not sure how the numbers play out but would expect that to be a big confounding factor given changes in global poverty.

stdbrouw · 4y ago · 2 in thread

The article presents algorithmic summarization as an alternative philosophy or paradigm to probabilistic modeling, but the thing is: we model when we have no choice but to model, which is any time when we can't perform a randomized experiment. I don't know a single statistician (and I'm in a department full of them) who would object to simple comparisons of two groups and few who would find the need to complicate the analysis by coming up with a full-blown causal mechanistic model, so really the author is arguing against people who don't exist.

saeranv · 4y ago

The author has an earlier post where he cites a particular study as an example of the overreliance on probabilistic models: http://www.argmin.net/2021/09/13/effect-size/ . The author seems to think this kind of inappropriate use of models is fairly prevalent.

What are your thoughts on that mask example? To me, this seems like a reasonable critique, but (as I just posted in this thread) I don't have a deep understanding of statistics, so I am a little uncertain if my interpretation of the blog post is correct.

stdbrouw · 4y ago

I think that blog post is very reasonable, and actually when I first heard about the mask study my thinking was very similar: "whoa, how do you even start to control for all of the kinds of interference that'll play havoc with the randomization?" But I don't agree with the claim that "P-values and confidence intervals associated with a regression are valid only if the model is true. What if the model is not true?" The key to statistical thinking is to think in terms of quantities rather than dichotomies, so a Poisson distribution may not be ideal, okay, but how far away from ideal is it; the randomization is not perfect, okay, but how much bias will that introduce? At the end of the day, you're still going to need a model to answer those questions.

saeranv · 4y ago

So I've read this article a couple of times and I still don't quite understand it. Can anyone summarize the key argument for a layperson?

Here's my best interpretation thus far:

This particular post is a continuation of his two earlier posts [1], [2], which seem to critique the way statistical significance is simplistically calculated from a Gaussian distribution when the effect size is small. This leads to an argument that the statistics community is too dependent on generative models for their interpretation, given that these models are typically too simplistic outside of the hard sciences:

> "But in biology, medicine, social science, and economics, our models are much less accurate and less grounded in natural laws. Most of the time, models are selected because they are convenient, not because they are plausible, well motivated from phenomenological principles, or even empirically validated. Freedman built a cottage industry around pointing out how poorly motivated many of the common statistical models are."

This part I am a little unclear on, but it seems like this leads him to suggest focusing on the random sampling as a way to get counts that you can then plug into various statistical formulas, without the need to assume a probabilistic model:

> "So what is the remedy here? The thing is, we already know the answer: if we randomized the assignment, we can estimate log odds by counting the number of positive outcomes under treatment and control, and then just plugging these values into the odds ratio. If you do this, you find an estimate whose median is precisely equal to the true log odds. No covariate adjustment is required."

So I think what the author means by "algorithmic summarization" is that we focus on randomized experiment design, and discard model assumptions. Is that right?
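As a sketch of the counting recipe in the quoted passage: with randomized assignment, the log odds ratio is computed directly from four counts, with no regression model. The function name `log_odds_ratio` and the counts below are invented for illustration and are not from the mask study.

```python
import math


def log_odds_ratio(pos_treat, n_treat, pos_ctrl, n_ctrl):
    """Plug-in log odds ratio from outcome counts in two randomized arms."""
    neg_treat = n_treat - pos_treat
    neg_ctrl = n_ctrl - pos_ctrl
    if min(pos_treat, neg_treat, pos_ctrl, neg_ctrl) <= 0:
        # The plain plug-in estimate is undefined when any cell is empty.
        raise ValueError("all four cells must be positive")
    odds_treat = pos_treat / neg_treat
    odds_ctrl = pos_ctrl / neg_ctrl
    return math.log(odds_treat / odds_ctrl)


# Hypothetical counts: 30 of 1000 positive under treatment, 40 of 1000 under control.
estimate = log_odds_ratio(30, 1000, 40, 1000)
print(f"log odds ratio: {estimate:.3f}")
```

A negative value means lower odds of a positive outcome in the treatment arm. How much uncertainty to attach to the number is a separate question, which is where the rest of this thread disagrees.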

If so I think this makes sense. I believe this is something Allen Downey has talked about before, specifically saying statistical experiments can now take advantage of cheap computational simulation to hit the large numbers needed for the sample to approximate the population, without a need for the typical model approximations developed in a pre-computational era. Downey's post here: http://allendowney.blogspot.com/2016/06/there-is-still-only-...

1. http://www.argmin.net/2021/09/13/effect-size/

2. http://www.argmin.net/2021/09/21/models-are-wrong/
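
One way to read the "cheap computational simulation" idea in the comment above is a permutation test: reshuffle the group labels many times and see how often chance alone produces a difference as large as the observed one. This is a sketch under that assumption, not necessarily the method in the linked posts; `permutation_p_value` and the data are hypothetical.

```python
import random
import statistics


def permutation_p_value(group_a, group_b, n_iter=10_000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
    pooled = list(group_a) + list(group_b)
    size_a = len(group_a)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)  # random relabelling of the pooled units
        diff = abs(statistics.fmean(pooled[:size_a])
                   - statistics.fmean(pooled[size_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_iter + 1)


# Made-up measurements for two randomly assigned groups.
treated = [171.2, 168.4, 175.9, 169.0, 173.3, 170.8]
control = [169.5, 167.1, 172.0, 166.8, 170.2, 168.9]
print(f"p-value: {permutation_p_value(treated, control, n_iter=5_000):.3f}")
```

The only assumption doing the work here is that the labels were assigned at random, which matches the thread's emphasis on randomization over distributional models.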
