The Dunning-Kruger Effect Is Autocorrelation

(economicsfromthetopdown.com)

379 points · grishas · 4y ago · 195 comments

130 comments · 43 top-level

andersource4y ago

Very interesting article and statistical analysis, but I really don't see how it concludes that the DK effect is wrong based on the analysis. The fact that the DK effect emerges with _completely random data_ is not surprising at all - in this case the intuitive null hypothesis would be that people are good at estimating their skill, therefore there would be a strong correlation between their performance and self-evaluation of said performance. If the data weren't related, then this hypothesis isn't likely, which is exactly what DK means. And indeed if you look at the plots in the article (of the completely random data), they depict a world in which people are very bad at estimating their own skill, therefore, statistically, people with lower skills tend to overestimate their skills, and experts tend to underestimate them.

Also wanted to point out that in general there is no issue with looking at y - x ~ x, this is called the residual plot, and is specifically used to compare an estimate of some value vs. the value itself.

That being said, the author seems very confident in their conclusion, and from the comments seems to have read a lot of related analyses, so I might be missing something. ¯\_(ツ)_/¯

leto_ii4y ago

> therefore there would be a strong correlation between their performance and self-evaluation of said performance. If the data weren't related, then this hypothesis isn't likely, which is exactly what DK means.

DK doesn't mean no correlation, it means inverse correlation. It's the correct analysis at the bottom that shows what no correlation actually looks like (at least no correlation in trend; there is heteroskedasticity).

> a world in which people are very bad at estimating their own skill, therefore, statistically, people with lower skills tend to overestimate their skills, and experts tend to underestimate it.

Be careful here, the conclusion you drew doesn't actually follow.

> y - x ~ x, this is called the residual plot

You're giving x and y meaning that they don't have. In the article these are uncorrelated random variables - the plot of y-x ~ x will always look that way. That's however not the case if you're plotting y_hat - y ~ y_hat for a y_hat taken out of a model. That won't be a random variable in your setup.

Edit: note on heteroskedasticity
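The claim that y - x against x "will always look that way" for uncorrelated variables can be checked numerically. The following is a minimal sketch, assuming both variables are independent uniform draws on 0-100 (an assumption that mirrors the article's random-data experiment; the variable names and sample size are illustrative):

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20_000
x = [rng.uniform(0, 100) for _ in range(n)]  # stand-in for test score
y = [rng.uniform(0, 100) for _ in range(n)]  # stand-in for self-assessment, independent of x

r_xy = pearson(x, y)
diff = [b - a for a, b in zip(x, y)]  # "self-assessment error" y - x
r_diff_x = pearson(diff, x)

print(f"corr(x, y)     = {r_xy:+.3f}")
print(f"corr(y - x, x) = {r_diff_x:+.3f}")
```

Under these assumptions the first correlation should sit near zero while the second should sit near -1/sqrt(2), even though x and y carry no information about each other.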

pdonis4y ago

> DK doesn't mean no correlation, it means inverse correlation.

No, it means that people's self-assessment, their prediction of what their test scores will be, is uncorrelated (or more precisely weakly correlated--that's what the original D-K data showed) with their actual test scores. Which is not what we would expect: we would expect that their predictions of their test scores would be strongly (or at least more strongly) correlated with their actual test scores. The question the D-K effect raises is why that is not the case, and it's a valid question--one which this article does not even attempt to answer.

uncomputation4y ago

> DK doesn't mean no correlation, it means inverse correlation

Both are wrong. The DK effect describes a positive correlation, but one weaker than we intuitively expect. The top quartile still rates its ability higher than the bottom quartile does (a positive correlation), but that rating is still below its actual ability. The bottom quartile correctly predicts that it scores worse than the top, but it overestimates its own ability, i.e., it underestimates how much better the top quartile actually performs.

andersource4y ago

> DK doesn't mean no correlation, it means inverse correlation. It's the correct analysis at the bottom that shows what no correlation actually looks like (at least no correlation in trend; there is heteroskedasticity).

Not sure I follow, inverse correlation between what? The analysis at the bottom (assuming you mean fig. 11) is too dense to show if there's a correlation or not (between skill and self-assessment bias), and looking at the relevant figure from the paper itself gives me the impression that there is a correlation.

>> a world in which people are very bad at estimating their own skill, therefore, statistically, people with lower skills tend to overestimate their skills, and experts tend to underestimate it.

> Be careful here, the conclusion you drew doesn't actually follow.

Can you elaborate? If X and Y are two independent random variables, X representing skill and Y representing self-assessment of skill, then X and the self-assessment error Y - X will be negatively correlated - this is exactly what the first part of the article is about, although from my perspective it's the author who's drawing the wrong conclusion.

>> y - x ~ x, this is called the residual plot

> You're giving x and y meaning that they don't have. In the article these are uncorrelated random variables - the plot of y-x ~ x will always look that way. That's however not the case if you're plotting y_hat - y ~ y_hat for a y_hat taken out of a model. That won't be a random variable in your setup.

Not following again. Other than calling x y_hat, and having y_hat be your own estimate vs. x be the subjects' estimate, what is the distinction? What do you mean by "the plot of y-x ~ x will always look that way" - what way? The shape of the plot will necessarily depend on the relationship between x and y.

tacitusarc4y ago

>> a world in which people are very bad at estimating their own skill, therefore, statistically, people with lower skills tend to overestimate their skills, and experts tend to underestimate it.

> Be careful here, the conclusion you drew doesn't actually follow.

How does that not follow? It's just regression to the mean.

alecbz4y ago

The author’s confidence is itself an indication that they’re more likely to be wrong.

Kidding. Well, half-kidding: I did kind of find the tone a bit biting and dismissive, especially towards one of the commenters who was pointing out exactly what you did.

It’s an interesting question to ask whether the uniformly random data “really” exhibits DK or not, and whether that’s interesting. A world where people have 0 ability to assess their own skill and resort to making uniformly random guesses at it is kind of interesting, and of course in such a world more skilled people would end up on average underestimating themselves and vice versa.

But I think the author’s right that obviously nothing psychological is happening here. There’s the psychological effect of no one being able to assess themselves, but the fact that unskilled people overestimate themselves in this world has nothing to do with the fact that they are unskilled.

andersource4y ago

> But I think the author’s right that obviously nothing psychological is happening here. There’s the psychological effect of no one being able to assess themselves, but the fact that unskilled people overestimate themselves in this world has nothing to do with the fact that they are unskilled.

If the results from DK were similar to the random data results, I'd agree. But the DK results do show some correlation between skill and self-assessment ability.

_dain_4y ago

>The fact that the DK effect emerges with _completely random data_ is not surprising at all - in this case the intuitive null hypothesis would be that people are good at estimating their skill, therefore there would be a strong correlation between their performance and self-evaluation of said performance. If the data weren't related, then this hypothesis isn't likely, which is exactly what DK means.

DK effect is not that low skill people are overconfident and high skill people are underconfident. It is specifically that low skill people are more overconfident than high skill people are underconfident. i.e. if someone's estimated skill is true_skill+bias+noise, then bias_lowskill > -bias_highskill.

This is very clear in the original DK paper, they specifically focus on the supposed metacognitive deficiencies of low-skill people.

The article argues that the graphs supposedly demonstrating this fact, can also be generated from a model that does not have this difference, i.e. where bias_lowskill == bias_highskill.

EDIT: My characterization of the article is not correct, see here[1] for a visualization of the point I'm trying to make.

[1] http://emilkirkegaard.dk/understanding_statistics/?app=Dunni...

nbernard4y ago

> The article argues that the graphs supposedly demonstrating this fact, can also be generated from a model that does not have this difference, i.e. where bias_lowskill == bias_highskill.

But as I understand it, it doesn't: In the graph generated using random data, the lines intersect in the middle (bias_lowskill == bias_highskill), whereas in DK's paper they intersect in the upper right (so bias_lowskill != bias_highskill).

andersource4y ago

That would make sense, except that graphs generated from random data show identical bias for overestimation and underestimation, as can be seen in the article. And this is opposed to graphs from the DK paper, which show a smaller underestimation bias for experts than an overestimation bias for low-skill people. (Of course that alone doesn't prove anything, just saying that to my understanding, nothing in the article contradicts my interpretation of DK).
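The symmetry described here can be reproduced with a few lines. This is a rough sketch, assuming actual scores and self-assessments are independent uniform draws on 0-100 (as in the article's random-data experiment); the quartile grouping and all names are illustrative, not taken from the DK paper:

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 20_000

# Assumption: (actual score, self-assessed score), drawn independently
people = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(n)]
people.sort(key=lambda p: p[0])  # order by actual score


def mean(values):
    return sum(values) / len(values)


quartile_size = n // 4
errors = []  # mean (perceived - actual) per quartile of actual score
for q in range(4):
    group = people[q * quartile_size:(q + 1) * quartile_size]
    actual = mean([score for score, _ in group])
    perceived = mean([guess for _, guess in group])
    errors.append(perceived - actual)
    print(f"Q{q + 1}: actual {actual:5.1f}  perceived {perceived:5.1f}  "
          f"error {perceived - actual:+6.1f}")
```

In expectation the perceived score is 50 in every quartile, so the bottom and top quartile errors should be roughly equal and opposite (about +37.5 and -37.5), and the two lines cross in the middle. An asymmetric pattern like the one in the DK figure would need something beyond this model.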

IceDane4y ago

I'm definitely not even close to a statistician, but I'm also having a hard time accepting this analysis.

I'll admit that part of it also comes from personal experience, at work and elsewhere. I've met some catastrophically incompetent people who were completely oblivious to their own incompetence, and it has very often felt like the more incompetent they were, the more likely they were to try to do stuff that was waaaay out of their comfort zone, stuff that would make even experienced, competent people tread carefully.

But even ignoring personal experiences, I'm not convinced by the arguments either. I understand what they are saying, but I don't see how this disproves the DK effect.

Even if everyone is equally bad at estimating their own skill, so that their estimate is essentially a completely random variable, then we would expect the self-assessment score average to be around 50. If I understand it correctly, this is essentially what figure 9 is demonstrating.

But that figure still says that worse performers are then likely to overestimate their own ability, just as much as it says that better performers are bad at it.

If we look at the original DK figure and contrast it with figure 9 with random data, then I think one way of interpreting the differences is that, yes, worse performers are indeed bad at self-assessment, but they're just kind of bad at it as if their self-assessment is a completely random variable. It then seems to keep being essentially random, but as people's skills improve, the distance between their score and their self-assessment becomes a bit tighter. So in conclusion: most people are pretty bad at self-assessment, but skilled people are a bit less so.

The end result is still that people in the bottom quartiles are going to over-estimate their own ability.

I don't know, maybe this is way out in the weeds. Please school me.

motoboi4y ago

The feeling you got from looking at the graph comes from the idea that the red line represents absolute values. It’s actually an average of the real values, and that is a really big difference and one of the reasons why it’s very easy to lie with statistics.

To correctly correlate the two variables (assessment and actual score) you need to correlate the actual data, not measures of its characteristics (the average being one of them).

The actual data is shown. Even by eye it’s possible to see no strong (or significant) correlation exists.

denton-scratch4y ago

/me not a statistician either.

If people are all pretty-much crap at estimating their own skill, then you'd expect all estimates to be roughly their actual skill, plus-or-minus some random error-margin with some kind of probabilistic distribution (Gaussian?).

If that were the case, then high-skilled people would be more likely to underestimate their skill (because their actual skill is greater than the mean). That (I think) is an example of "reversion to the mean".

If that reasoning is right, then the DK claim may be true, but it says nothing about the comparative estimating propensities of high-skilled and low-skilled people. It just says that low-skilled people tend to overestimate their skill, and vice-versa. But that's exactly what you'd expect, amirite?

cestith4y ago

I think part of the issue here is you're talking about people estimating future accomplishment where they have no skill, and all variations of these experiments are people assessing how they performed on a test they just took. There's very likely a difference between someone doing open-ended talking wildly in a field where they don't even know what's possible and an introspective assessment of themselves on a concrete task they just completed.

BlueTemplar4y ago

You seem to be forgetting about under-estimation, so your conclusions don't follow from your premises?

> most people are pretty bad at self-assessment, but skilled people are a bit less so

This is pretty much the conclusion of the article, except that it isn't the tautological DK plot or figure 9 that shows it, but Nuhfer's figure 11.

bouncycastle4y ago

The problem is that they calculated each person’s ‘self-assessment error’ with the actual test score. This error is the difference between a person’s self assessment and their test score.

This is like comparing x - y to x, and if you do this, you will get a correlation no matter what.
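How strong that built-in correlation is can be worked out directly. As a sketch, assume x and y have standard deviations \sigma_x and \sigma_y and correlation \rho (symbols introduced here for illustration, not used in the article):

```latex
\begin{align*}
\operatorname{Cov}(x - y,\, x) &= \sigma_x^2 - \rho\,\sigma_x\sigma_y \\
\operatorname{Var}(x - y)      &= \sigma_x^2 + \sigma_y^2 - 2\rho\,\sigma_x\sigma_y \\
\operatorname{corr}(x - y,\, x) &= \frac{\sigma_x - \rho\,\sigma_y}{\sqrt{\sigma_x^2 + \sigma_y^2 - 2\rho\,\sigma_x\sigma_y}}
\end{align*}
```

For independent variables with equal spread (\rho = 0, \sigma_x = \sigma_y) this gives 1/\sqrt{2} \approx 0.71, with the sign flipped if the difference is taken as y - x. The correlation vanishes only in the special case \rho\,\sigma_y = \sigma_x, so "no matter what" is nearly, though not literally, true.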

js84y ago

You might be biased towards avoiding catastrophic risk. Somebody who doesn't know what they are doing, and doesn't realize it, is more dangerous than somebody who knows what they are doing yet takes precautions as if they didn't.

andersource4y ago

This is my understanding as well.

uldos4y ago

Unskilled people are more random with their self assessment than skilled people. It has nothing to do with unskilled people thinking that they know everything.

andersource4y ago

> Unskilled people are more random with their self assessment than skilled

This is a statistical claim, supported by the DK graph (but not the random data thought experiment from the article).

> It has nothing to do with unskilled people thinking that they know everything

This reads to me as a claim about the psychological reason for the statistical pattern, which I don't think is either supported or contradicted by the data, whether in the article or in the original paper.

kenjackson4y ago

> Also wanted to point out that in general there is no issue with looking at y - x ~ x, this is called the residual plot, and is specifically used to compare an estimate of some value vs. the value itself.

This article seemed very unconvincing -- and this part noted above, early on in the article set the tone that I felt like the author didn't know what they were doing. And even after reading it all, I felt like the standard lay use of DK remained valid.

This just felt like the type of thing I would have thought about as an undergrad, started to write it, and then realized it didn't make sense halfway through it. Or maybe I just missed something as well...

usefulcat4y ago

IMO the most interesting thing is not so much that you can get DK from noise, it's that the Nuhfer study was utterly unable to replicate the DK effect. If DK is real, there should have been at least a hint of it visible in the Nuhfer study.

Accujack4y ago

>the author seems very confident in their conclusion

They are, but honestly all that can be concluded safely IMHO is that the original D-K graph doesn't support the existence of the widely discussed "effect" that their conclusion describes. Therefore, unless there is more evidence from some other subsequent study, there may not be any evidence for it at all, and if that's the case then there's potentially no proof it exists.

However, even if you prove that their data is not evidence, that doesn't actually say anything about whether the effect exists or not, just that the D-K paper isn't evidence of such an effect.

I don't think that's enough for the author to conclude that "DK is autocorrelation". A more careful conclusion would be that "the DK data do not support DK's conclusion"... but of course that's much less likely to attract click-throughs.

hgomersall4y ago

This is it. You need to write down the model then perform the inference. With some rudimentary munging you can get a super simple linear model.
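One way to read "write down the model" is a straight-line fit of self-assessment against score. The sketch below uses made-up data: the true slope of 0.25, the upward shift and the noise level are all assumptions chosen for illustration, not values from the DK or Nuhfer data, and `fit_line` is a hypothetical helper:

```python
import random

rng = random.Random(2)  # fixed seed so the run is repeatable
n = 5_000

# Hypothetical data: guesses track scores only weakly and are shifted upward
scores = [rng.uniform(0, 100) for _ in range(n)]
guesses = [65 + 0.25 * (s - 50) + rng.gauss(0, 15) for s in scores]


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


intercept, slope = fit_line(scores, guesses)
# Score at which the fitted self-assessment equals the actual score
crossover = intercept / (1 - slope)

print(f"slope = {slope:.2f}, intercept = {intercept:.1f}, crossover = {crossover:.1f}")
```

A slope well below 1 says self-assessments are compressed toward a common value, and the crossover point says where overestimation turns into underestimation. Both are parameters that can be estimated and compared across groups, rather than read off a quartile plot.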

uoaei4y ago

> the intuitive null hypothesis would be that people are good at estimating their skill, therefore there would be a strong correlation between their performance and self-evaluation of said performance

That's not really the intended interpretation of "null" in "null hypothesis". "Null" does not mean "contrary to the effect you're testing". "Null" means "do not assume dependencies anywhere" and so your description is backwards.

andersource4y ago

"Null hypothesis" comes down to agreeing on a prior that is "reasonable". Most of the time, that indeed means not assuming dependencies, e.g. when testing the outcomes of a medical treatment. But that's not always the case, e.g. does it seem reasonable to you to assume a priori no dependency between a person's age and their height? Does the result "People get taller until the age of 20" merit a journal article? There's a very strong correlation there after all.

As I've written elsewhere, this all comes down to the prior. Based on my life experience, the prior "people are generally capable of self-assessing their performance" is much more likely than "people have absolutely no ability to self-assess their performance."

To state it differently, when you assume no dependencies anywhere, you've already jumped to a conclusion that is more far-fetched to me than the results of DK. Do you really think people have zero ability to self-assess their own performance? In all domains and all contexts, as this is your "null hypothesis"?

hgomersall4y ago

Null means "that thing which I've decided to attach special significance to". There's in general no reason to prefer a null hypothesis over any other hypothesis.

bitshiftfaced4y ago

Check out Nuhfer et al 2016, who had a different explanation for why Dunning Kruger wasn't true.

Dunning Kruger effect: lower performers overestimate their ability, and higher performers underestimate their ability.

How did they find that? They asked participants to take a test and then had them do a self-assessment. Both were standardized from 0-100. They rated a participant's self-assessment accuracy by "self-assessment minus test score."

What's wrong with that method? You can't arrogantly self-assess as though you got a 130, and you can't humbly say that you got -50. Because of the standardization, you're bound by 0 and 100. This method makes it almost impossible for higher performers to overestimate their ability and for lower performers to underestimate.

What they actually found was that higher performers tend to be better at self-assessment. Lower performers are less accurate, but in both directions (not just overconfident).
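The ceiling and floor argument can be illustrated with a small simulation. This is a sketch under stated assumptions: self-assessment is the true score plus symmetric Gaussian noise with no skill-dependent bias at all, and the only extra ingredient is clipping to the 0-100 scale. The noise level and all names are illustrative:

```python
import random

rng = random.Random(3)  # fixed seed so the run is repeatable
n = 20_000
noise_sd = 30  # assumed size of the self-assessment error


def clip(value, low=0.0, high=100.0):
    """Force a value onto the bounded 0-100 reporting scale."""
    return max(low, min(high, value))


rows = []
for _ in range(n):
    score = rng.uniform(0, 100)
    raw_guess = score + rng.gauss(0, noise_sd)  # unbiased: no skill-dependent term
    rows.append((score, raw_guess, clip(raw_guess)))
rows.sort(key=lambda r: r[0])  # order by actual score


def mean(values):
    return sum(values) / len(values)


quarter = n // 4
bottom, top = rows[:quarter], rows[-quarter:]
raw_bottom = mean([g - s for s, g, _ in bottom])
raw_top = mean([g - s for s, g, _ in top])
clipped_bottom = mean([c - s for s, _, c in bottom])
clipped_top = mean([c - s for s, _, c in top])

print(f"unclipped error: bottom {raw_bottom:+.1f}, top {raw_top:+.1f}")
print(f"clipped error:   bottom {clipped_bottom:+.1f}, top {clipped_top:+.1f}")
```

Without the bounds, the average error should be close to zero in both groups. With the bounds, the bottom quartile appears to overestimate and the top quartile to underestimate, purely because errors in one direction are cut off.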

brnaftr3614y ago

Author includes reference to Nuhfer, related studies can be found here:

https://digitalcommons.usf.edu/numeracy/vol9/iss1/art4/

https://digitalcommons.usf.edu/numeracy/vol10/iss1/art4/

krick4y ago

You've been corrected about "what DK means" in the other comments, but this is not quite the point of the post. This is not about if DK (as expressed in English words) is true or not — in fact, author points out in the beginning that it's one of these "everybody knows it's like that" ideas (as it often is with social psychology).

The point is, that the original DK paper is bullshit. At least, this plot is. And people tend to miss it, until they start to carefully read the labels and think about the caveats. In fact, as presented here it looks like it shouldn't even be accepted as a valid study, this is outright deceptive, maliciously so. If there is assumed to be a correlation between x & y, how about we start by plotting x against y then? I know, it may be messy. It almost certainly will be. Because of that, I personally won't even be offended (but some people might) by you removing the outliers and producing the unnaturally clean version of the plot in the end to highlight the main idea. Then some statistical tests to make the results quantified. But here we see nothing, it really is just comparing x to x.

IMO, this is pretty much the invariant of most of the problems of academic research in the last God-knows-how-many decades (maybe always was, I don't know). Computer science papers without the code. Data science papers without the data. Yeah-yeah, I've heard hundreds of excuses why researchers do it like that. But it's pointless, such "research" shouldn't be accepted by anybody. Either you make your findings actually public by providing everything to replicate every single step of your study (which is supposed to be the point), or you just don't publish anything and keep the research proprietary (I mean, obviously it's never black and white, there always will be concerns about test-subject anonymity, etc. — but it's ridiculous to discuss that when the accepted standard even in "proper" sciences are 20 pages of dense text which might never even get to the point of the study, i.e., actually showing the data to any extent.)

andersource4y ago

I strongly disagree, not necessarily with everything (e.g. I don't have access to the raw data from the DK experiment, don't know how well they performed all the analysis leading to the plot). But the plot itself is not inherently deceptive, and, unlike implied in the article, is not equivalent to "just comparing x to x". The plot essentially shows the actual performance vs. self-assessment of performance, compared to what we would expect if there were perfect correlation.

pdonis4y ago

> I really don't see how it concludes that the DK effect is wrong based on the analysis.

Neither do I. Basically what the article actually shows is that these two statements are equivalent:

(1) People with low test scores tend to overpredict their test scores, while people with high test scores tend to underpredict their test scores.

(2) People's predictions of their test scores are uncorrelated (or more precisely very weakly correlated [1]) with their actual test scores.

This is not a statement that the D-K effect is wrong. It's just restating what the D-K effect is in different words. All the talk about "autocorrelation" is just another way of saying that, if people's predictions of their test scores are only weakly correlated with their test scores, then people with low test scores will have to overpredict their test scores (because there's virtually no room to underpredict them--there's a minimum possible test score and their actual score is already close to it), and people with high test scores will have to underpredict them (because there's virtually no room to overpredict them--there's a maximum possible test score and their actual score is already close to it). But the real question is: why are x and y so weakly correlated? Why are people's predictions of their test scores so weakly correlated with their actual test scores? That is not what one would intuitively expect. That is the question the D-K effect raises, and the author not only doesn't answer it, he doesn't even see it.

Also, this statement in the description of the Nuhfer research doesn't make sense:

"What’s important here is that people’s ‘skill’ is measured independently from their test performance and self assessment."

Um, the test performance is the people's "skill". And in the original D-K research, it was "measured independently" from the people's self-assessment (their prediction of their test performance).

[1] Notice that in the "uncorrelated data" graph, Figure 10, the red line is basically horizontal. That's what you get when x and y are uncorrelated. But in the original D-K graph, Figure 2, the thick black line is not horizontal--it slopes upward. That's what you get when x and y are weakly correlated. If the author had put in a weak correlation between x and y in his own experiment, he would have gotten a graph that looked like Figure 2. But of course that still would do nothing to explain why x and y are so weakly correlated, which is the actual question.
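The footnote's point about slopes can be sketched numerically. The example below assumes standardized (mean 0, variance 1) normal scores and a chosen correlation `rho` between actual score x and predicted score y; the function name, the value 0.3 and the use of normal rather than uniform data are all illustrative assumptions:

```python
import math
import random


def quartile_means(rho, n=20_000, seed=4):
    """Mean predicted score y within each quartile of actual score x."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        # y has unit variance and correlation rho with x
        y = rho * x + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        pairs.append((x, y))
    pairs.sort(key=lambda p: p[0])
    quarter = n // 4
    return [
        sum(y for _, y in pairs[q * quarter:(q + 1) * quarter]) / quarter
        for q in range(4)
    ]


flat = quartile_means(rho=0.0)    # uncorrelated: roughly horizontal line
weak = quartile_means(rho=0.3)    # weakly correlated: gentle upward slope
strong = quartile_means(rho=1.0)  # perfect self-knowledge: follows actual score

for label, line in (("rho=0.0", flat), ("rho=0.3", weak), ("rho=1.0", strong)):
    print(label, " ".join(f"{m:+.2f}" for m in line))
```

The rise from bottom to top quartile should scale with `rho`: near zero when uncorrelated, modest at 0.3, and full-sized at 1.0. That matches the reading that a gently sloping line is what weak correlation looks like, without saying anything about why the correlation is weak.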

hgomersall4y ago

My issue with the thought process is the general picking of uniform distributions to show random data that follows DK. Uniform distributions are kind of uninteresting in this case because they're pretty artificial. It's not like competence is actually measured in the range 0-100 in anything that actually matters.

msrenee4y ago

Please feel free to tell me why my interpretation is wrong. I understand stats just well enough to get myself in trouble.

The line for actual ability is basically x=y. If you scored 10%, you're in the bottom quartile. If you scored 100%, you're in the top quartile. That line isn't really data, just something for comparison. The perceived ability line is the one that utilizes the data. It seems to show that once you average out what everyone rated themselves, it ends up kind of in the middle between ~55-70%. So the people who scored 10% assumed, on average, they would score around 55%. The people who scored 100% assumed, on average, that they would score about 75%. That makes the average expected score much higher than the actual score on the low end and somewhat lower than the actual score on the high end.

I'd interpret this as the bottom quartile thinks they're average and the top quartile thinks they're a bit above average. So basically everyone thinks they're average-ish, but the people who did worst on the test were the most wrong about that. But then again, I can't remember what the questions on the test were even about and just the single graph isn't terribly useful to argue over because it's missing all of the context of the paper.

Now that I've sat and interpreted the graph using my own set of notions about what the numbers mean and what the graph actually shows, I feel like this ought to be used as one of those life lessons about how quotes and diagrams outside of their context within a paper are the epitome of the phrase "lies, damn lies, and statistics." Statistics aren't always lies, but they're incredibly easy to bend to your own biases and assumptions.

omnicognate4y ago

The fact that the statistical artifact is seen in completely uncorrelated data is only shown as a demonstration that it is not itself evidence of the claimed effect. To gather evidence that the effect doesn't actually exist you need a new experiment, not just a new analysis of the same data, because the different levels of actual skill need to be established separately from establishing the error in skill self-assessment. In the original experiment the test results were used to establish both, which is not sufficient.

But the article presents the results from just such a new experiment. In this one they used university education level (sophomore through to professor) and measured skill self-assessment level within those groups. Higher education level (a good proxy for skill on the test used, which was about science literacy) was found to be associated with more accurate skill self-assessment but the bias of lower-skilled people overestimating their skills was not observed.

That's just one study of course, but it sounds like a much better designed one than the original and does constitute actual evidence that the Dunning-Kruger effect doesn't exist.
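Why it matters that skill is established separately can be shown with a toy model. This is only a sketch: the latent `skill` variable, the noise levels and the quartile grouping are assumptions made for illustration (the Nuhfer study used education level as its independent measure, not a latent variable). Here self-assessment is unbiased by construction:

```python
import random

rng = random.Random(5)  # fixed seed so the run is repeatable
n = 20_000

rows = []
for _ in range(n):
    skill = rng.uniform(20, 80)       # latent skill, measured independently
    test = skill + rng.gauss(0, 15)   # noisy test score
    guess = skill + rng.gauss(0, 15)  # noisy but unbiased self-assessment
    rows.append((skill, test, guess - test))  # last field: self-assessment error


def quartile_errors(data, key_index):
    """Mean self-assessment error within quartiles of the chosen column."""
    ordered = sorted(data, key=lambda r: r[key_index])
    quarter = len(ordered) // 4
    return [
        sum(r[2] for r in ordered[q * quarter:(q + 1) * quarter]) / quarter
        for q in range(4)
    ]


by_test = quartile_errors(rows, 1)   # grouping by the same test used in the error
by_skill = quartile_errors(rows, 0)  # grouping by the independent skill measure

print("grouped by test score:", " ".join(f"{e:+.1f}" for e in by_test))
print("grouped by skill:     ", " ".join(f"{e:+.1f}" for e in by_skill))
```

Grouping by the test score should produce the familiar pattern (positive error at the bottom, negative at the top) even though nobody in this model is biased, while grouping by the independent measure should show errors near zero throughout.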

andersource4y ago

> The fact that the statistical artifact is seen in completely uncorrelated data is only shown as a demonstration that it is not itself evidence of the claimed effect

I don't understand this part. "Completely uncorrelated data" is usually taken to represent the null hypothesis, but that's not the case here. In the DK paper, the implicit null hypothesis is "people of all skill levels are good at estimating their performance". In this case the "completely uncorrelated data" matches an alternative hypothesis, "people's skills have nothing to do with their ability to estimate their performance in tasks testing that skill". This hypothesis doesn't outright contradict the DK proposed hypothesis (and is certainly not the DK null hypothesis), so getting similar results is unsurprising to me, and I'm not sure that we learn from it anything about the DK results.

As for the other study cited, the figure shown in the article doesn't give a lot of information on density, and looking at the paper itself, figure 4 does actually seem to show that self-assessment gradually shifts left with increasing level of education.

(Edited for accuracy).

ohwellhere4y ago

It depends on what one means by the "Dunning-Kruger effect."

I had the same impulse that the analysis did not disprove DK, but after sitting with it for overlong I agree with the analysis.

I think there are two competing DK effect definitions that are being conflated, one descriptive and one explanatory:

1. DK shows that less skilled people overestimate their ability, and highly skilled people underestimate it

2. DK shows that people's estimation of their ability is causally determined by their actual ability

I believe you are claiming, correctly, that the article does not disprove the first definition, which describes the observation, but I think the article is trying to disprove the second definition, which explains why it occurs.

In other words: Yes, there is an observable Dunning-Kruger effect in the sense that we're bad at self evaluation. Is that effect attributable to one's actual level of competence? The evidence for that appears to be a statistical artifact, and further experiments seem to disprove that conjecture.

I'm not a statistician or a psychologist.

longtimegoogler4y ago

Yup. That's what I was going to say. Their data suggests that everyone kind of estimates their ability similarly, so that more skilled people underestimate their ability (impostor syndrome) and less skilled people overestimate their abilities.

a-dub4y ago

i immediately jump to think that replacing all the data with noise is a pretty good null hypothesis (at least for the analysis). is that not true?

rawgabbit4y ago

You're changing the definition of the null hypothesis. What you are essentially saying is that there is no need to perform experiments and studies, I can arbitrarily grab data from anywhere and if it fails to support the hypothesis... then the hypothesis is wrong.

TimPC4y ago

The article is correct. The effect is statistical not psychological. It emerges even from artificial data and occurs independently of the supposed psychological justifications even for data where those justifications are clearly removed.

If you adjust the experiment design to avoid introducing the auto-correlation you get data that doesn't show the DK effect at all. Some might take issue with the adjusted experiment as using seniority related categories like "sophomore" and "junior" as skill levels has its own issues. To show the DK effect is real you need to come up with a better adjusted experiment that avoids the autocorrelation while still generating data that generates the effect. It's unclear if that's possible.

hungrygs4y ago

Just anecdotal, but my life observation of DK is often highly intelligent and competent people in a particular field who then generalize that to pontificate and proclaim, directly or indirectly, superior understanding to certified domain experts (e.g., people who have directly related advanced degree(s) or have worked in the field for decades). It thus seems at least as much a psychological effect - in short, people with a know-it-all personality type and a sense of superiority who have never done the deep and hard work to gain or demonstrate any competency in said areas. A common side observation is, of course, unfounded conspiracy theories claiming that the derided experts have sinister intentions.

TimPC4y ago

Anecdotes are not data. Even in plural form. We can witness isolated instances of what seems to be a phenomenon without it actually being part of a phenomenon. Other posters have already established that some experts can overestimate their expertise as well. The study mentioned by the original post and in my comment seems to suggest that the overestimation bias is prevalent across a wide range of cohorts of expertise. Senior students are just as likely to overestimate their talents as freshmen for instance. This effect likely extrapolates to experts as well it's just hard to get good data.

No DK doesn't say no bluster, no proclamations or no artificial assertions of expertise. It doesn't even say that the overestimates are just as prevalent among experts as laypeople. All it says is as near as we can tell the effect size of the overestimation is the statistical autocorrelation and our best efforts to produce the same effect without relying on the autocorrelation have failed.

I think there are a lot of ways to accept the anecdotes you mentioned occur that need much weaker assertions than DK as a psychological phenomenon and would hesitate to jump to DK based on that information.

derbOac4y ago

To be fair, I think the counterargument is that ostensible experts can overstate their ability/skill/knowledge to detrimental effect just as easily, and by virtue of the label ignore the reality of the argument/scenario at hand. That is, experts can overlook mistakes they're making, or conflicts of interest, etc because of their status, and because they overestimate their own ability. There's been studies of this in group decision making in crisis situations, where hierarchies can cause failures because the "leader" becomes overconfident and fails to heed warnings by others in the group.

This all gets really murky quickly in practice because of what "low" and "high" competence means, and what constitutes the actual scope of expertise with reference to a particular scenario.

NaturalPhallacy4y ago

Classic example:

>The V-tail design gained a reputation as the "forked-tail doctor killer",[16] due to crashes by overconfident wealthy amateur pilots,[17] fatal accidents, and inflight breakups.[18] "Doctor killer" has sometimes been used to describe the conventional-tailed version, as well.

https://en.wikipedia.org/wiki/Beechcraft_Bonanza

sweezyjeezy4y ago

https://xkcd.com/793/

knorker4y ago

> The effect is statistical not psychological.

It is, though. This article says that if people are bad at estimating their skill (towards randomness / the midpoint), then bad people will overestimate, and good people will underestimate.

The psychological part is that people indeed will assume they're closer to the mean than they actually are. DK effect would not be seen if people correctly estimated their skill, nor if experts overestimated, nor if the incompetent underestimated.

ImaCake4y ago

Yes you are right and I don't understand how so many commenters in this thread can so confidently state that the article is wrong. I just went ahead and did the random simulation myself and you get the "Dunning Kruger effect" which is exactly what the author paints it as: autocorrelation.

>If you adjust the experiment design to avoid introducing the auto-correlation you get data that doesn't show the DK effect at all.

This is even in the article! Yet some people are making claims against this despite references to the contrary.

doetoe4y ago

The article shows that the qualitative claim, that low scoring people tend to overestimate, and high scoring people tend to underestimate their ability, is nothing but a statistical truism.

For the Dunning-Kruger effect to have psychological significance, you must quantify this, and show that they overestimate their abilities more resp. less than expected.

Regardless of how you set your expectation/null hypothesis, the absence of the effect would mean that the lowest scoring quartile would on average estimate their abilities to lie at 50% or below. It is found however that people in the lowest scoring quartile position themselves in the third or fourth quartile.

I'm not saying that this is necessarily deep or unexpected, just that the article only shows that the qualitative statement is true regardless of any psychological factors, and not that the Dunning-Kruger effect doesn't exist

jakear4y ago· 6 in thread

Article seems to be saying “DK doesn’t exist because it always exists”. Which is… absurd?

The point of DK is that when you don’t know shit, any non-degenerate self assessment will result in overestimating your ability. In short, “there are more natural numbers above smaller natural numbers than bigger ones”. This doesn’t have to do with psychology, and it’s expected that it appears when evaluating random data. That’s a good thing! It means DK exists even when us pesky humans aren’t involved at all, not that DK doesn’t exist at all.

larwent4y ago

Seems pretty simple. When we create upper and lower boundaries to some score, people with lower scores have more space to overestimate and those with higher scores more space to underestimate, causing the perceived score to trend towards the mean.

I think there's both a component of numbers and psychology here. If the dispersion in perceived score caused by inaccuracy is wide enough to touch the bounds, it will force a trend towards the mean. This effect is possibly exacerbated by a tendency of perception to stray from "extremes", so subjects with a score near the edges will trend to the mean more strongly as they are unlikely to rate themselves the very best or very worst.

ewzimm4y ago

This seems pretty simple to correct, so I'm skeptical that nobody has done so yet in these experiments. If true, it's an equally interesting oversight as the Monty Hall problem. The basic premise is that the structure of an experiment will naturally nudge randomness in a particular direction, and we need to adjust for that in the analysis. Everyone who does this type of work should know this.

In a simplified experiment where we give people a 3 question quiz, those who got 2 questions right have one overestimation option, 3, and two underestimation options, 0 and 1. So it's very easy to adjust for autocorrelation by checking if a large group of 2-scorers underestimate more than twice as often as they overestimate. Then we see how their tendencies compare against 1-scorers and how they deviate from naturally overestimating more than twice as often as underestimating.
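The adjustment described above can be sketched roughly as follows. This is only an illustration of the counting argument: the function names `chance_baseline` and `excess_overestimation` are made up here, and the baseline assumes that a clueless test-taker picks uniformly among the wrong self-scores.

```python
def chance_baseline(score, max_score=3):
    """Count the over- and underestimation options open to someone with `score`."""
    over = max_score - score   # possible self-scores strictly above the true score
    under = score              # possible self-scores strictly below the true score
    return over, under


def excess_overestimation(score, n_over, n_under, max_score=3):
    """Observed over:under ratio divided by the ratio expected from pure guessing.

    Only meaningful for interior scores (0 < score < max_score) and n_under > 0.
    A value near 1 looks like uniform guessing; above 1 means more
    overestimation than the structure of the quiz alone would produce.
    """
    over, under = chance_baseline(score, max_score)
    expected_ratio = over / under
    observed_ratio = n_over / n_under
    return observed_ratio / expected_ratio


# 2-scorers have 1 way to overestimate and 2 ways to underestimate.
print(chance_baseline(2))
# 30 overestimators vs 60 underestimators is exactly what chance predicts.
print(excess_overestimation(2, n_over=30, n_under=60))
```

Comparing this ratio between 1-scorers and 2-scorers is one way to ask whether low scorers overestimate more than the bounded scale already forces them to.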

I haven't reviewed these types of papers, but if nobody made even that basic adjustment in their analysis, how many others have been missed in experiments like this?

stevage4y ago

It would be possible to rule out that effect in an experiment.

erikmolin4y ago

I share your sentiment that DK being a statistical phenomenon actually makes the effect more interesting. I also, however, think that you are being too semantically liberal when you insist that this proves the effect exists. The DK effect is a _psychological_ effect, with a whole lot of psychological theories around its causes. If the cause is statistical, the psychological effect can no longer be said to exist.

The statistical phenomenon exists, surely - but I think it will be very confusing for everyone to re-use the same name.

cortesoft4y ago

> In short, “there are more natural numbers above smaller natural numbers than bigger ones”

I get what you are trying to say, but this isn’t true… every natural number has the same amount of numbers greater and smaller… an infinite number.

jakear4y ago

Natural numbers are not the same as integers. They’re strictly positive.

(Now rate your confidence in making that assertion.)

PaulKeeble4y ago· 4 in thread

Modern Psychology is having a lot of these sorts of results over the last decade, none of their methods are holding up under proper scrutiny. They are struggling to reproduce findings but more critically even the reproduced ones are turning out to be statistical and mathematical errors like shown here. Some of the findings have also done severe harm to patients over the decades as well, I can't help but think we need a lot of caution when it comes to psychology results given its harmful uses (such as the abuse of ill patients) and its lack of truthful results.

Phileosopher4y ago

I'm convinced it's associated with the methodology of how psychology has approached matters.

In the world of biology, you're observing the world around you. Same for physics, chemistry, et al. This means that you can set up proper controls to obscure your own presence from any potential results (e.g., isolate everything in another room, use cameras to avoid being near animals, etc.)

Psychology has the same nightmare as quantum physics: pre-existing thoughts and beliefs literally define what results you end up with.

I'm convinced that psych is a victim of the "new" way of doing science: treating the Scientific Method™ as a self-evident concept instead of regarding science as a vastly certain domain of metaphysics.

BlueTemplar4y ago

Well, all sciences have this issue : see Kuhn, Feyerabend...

https://samzdat.com/2018/05/19/science-under-high-modernism/

It would be kind of ironic if psychologists were more susceptible to the "Pop-Baconian" simplification of science ?

And damning if what I've heard about psychology going through multiple paradigms during the 20th century alone is true.

But then, indeed, also understandable, due to the "softness" and "paradigmlessness" of the subject matter, as Kuhn had pointed out the latter back then ?

It's still sad how we now have detailed theories and histories of science, but in practice scientists show no interest in trying to learn from them, nor the mistakes of their predecessors?

But then maybe that would have too high of a cost. (Consider all those successful projects where the founders later say : "we didn't know what we were getting into / that this was considered impossible".)

Bonus : Wiseman & Schlitz’s attempts to do an adversarial super-controlled parapsychological experiment : ( IV. )

https://slatestarcodex.com/2014/04/28/the-control-group-is-o...

aidenn04y ago

So many psychology experiments are so fantastically underpowered, that if the effect they are attempting to measure were real, odds are that it would have to be of such a large magnitude that it would completely overturn all of our beliefs on how humans function. And they can publish with a P value of 0.049.

So, implicit in the standards for publishing new research in psychology is "We think there is much greater than a 5% chance that our entire field is wrong" which is not a great place to start from.

roguecoder4y ago

Maybe asking 45 Cornell undergraduates and then generalizing that to "people" was a bad idea after all.

ithkuil4y ago· 4 in thread

In other words people are quite bad at estimating their skill level. Some people will overestimate, while some other people will underestimate and on average there will be a relatively constant estimated skill level that doesn't change all that much based on the actual abilities.

Given that fact, it logically follows that people who score low ability tests will more often than not have overestimated their ability (and the same on the other end of the spectrum).

You can frame this effect as autocorrelation if you wish or just as a logical consequence. But that's missing the point.

The point is: why on earth are humans so bad at estimating their own competence level as to make it practically indistinguishable from random guesses.

d0mine4y ago

- what DK claims: there is bias (incompetent people overestimate their ability)
- what data actually shows: there is a greater variance (incompetent people both over and _under_ estimate to a larger degree compared with more competent people). Data shows heteroscedasticity. No bias (estimations are around zero +/-, tighter for more competent).

ithkuil4y ago

The data supports the claim because indeed it turns out that incompetent people overestimate their ability. This phenomenon exhibits itself with random data too, so it clearly doesn't mean that incompetent people overestimate their ability because of their incompetence.

Or is it?

The trick lies in the fact that when asked to judge your competence you're given a range (e.g. 0-10) and both competent people and incompetent people have access to the whole range when taking a self-assessment. I.e. if less competent people were on average more aware of their incompetence they may be less likely to rate themselves 5 or 6, but yet the data shows that no matter what competence level you have on average you self-assess more or less the same.

This seems to imply that your incompetence indeed doesn't allow you to truly appreciate the full range of skills that are required to reach a higher level of competence.

In other words, the DK effect itself is the cause of the random distribution of the skill self-assessment (which in turn is the cause of the overestimation secondary effect)

kizer4y ago

That’s what I was thinking; if the average is about constant all you’ve shown is that everyone is bad at self-assessment (another issue - not fully qualifying a distribution by just using the average loses information).

But a comment above quoting the more recent paper presents a contradictory conclusion: that humans can self-assess with some accuracy. So now I’m confused again.

d0mine4y ago

You might have meant this comment https://news.ycombinator.com/item?id=31039901 (it says people can self-assess (no bias), more competent people do it better (less variance))

oceliker4y ago· 4 in thread

I think the gist of the article is this:

Suppose you make 1000 people take a test. Suppose all 1000 of these people are utterly incapable of evaluating themselves, so they just estimate their grade as a uniform random variable between 0-100, with an average of 50.

You plot the grades of each of the 4 quartiles and it shows a linear increase as expected. Let's say the bottom quartile had an average of 20, and the top had 80. But the average of estimated grades for each quartile is 50. Therefore, people who didn't do well ended up overestimating their score, while people who did well underestimated it.

In reality, nobody had any clue how to estimate their own success. Yet we see the Dunning-Kruger effect in the plot.
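A minimal sketch of that thought experiment, using only the Python standard library. The comment does not say how the actual grades are distributed, so drawing them uniformly from 0-100 is an assumption made here (which puts the bottom and top quartile averages nearer 12 and 88 than 20 and 80); `quartile_means` is a made-up helper name.

```python
import random
from statistics import mean


def quartile_means(n=1000, seed=0):
    """Return (actual_mean, estimated_mean) for each quartile of actual grade."""
    rng = random.Random(seed)
    # Actual grade and self-estimate are drawn independently:
    # nobody has any information about their own result.
    people = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(n)]
    people.sort(key=lambda p: p[0])  # rank by actual grade
    size = n // 4
    result = []
    for q in range(4):
        group = people[q * size:(q + 1) * size]
        result.append((mean(a for a, _ in group), mean(e for _, e in group)))
    return result


for q, (actual, estimated) in enumerate(quartile_means(), start=1):
    print(f"Q{q}: actual {actual:5.1f}, estimated {estimated:5.1f}")
```

The actual means climb from quartile to quartile while the estimated means all hover around 50, which is the shape of the classic chart.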

andersource4y ago

That's the way I understand the statistical analysis, and in my view this exactly supports (not contradicts) DK:

> In reality, nobody had any clue how to estimate their own success.

Wouldn't that mean unskilled people tend to overestimate their skill, and experts tend to underestimate it? Why is there a contradiction with DK's conclusions?

oceliker4y ago

> Wouldn't that mean unskilled people tend to overestimate their skill, and experts tend to underestimate it?

I think it's because the original paper speculates far beyond it:

> The authors suggest that this overestimation occurs, in part, because people who are unskilled in these domains suffer a dual burden: Not only do these people reach erroneous conclusions and make unfortunate choices, but their incompetence robs them of the metacognitive ability to realize it.

The argument about autocorrelation says this "dual burden" doesn't need to be there to observe the effect.

laszlokorte4y ago

Someone who has a skill=0 can not underestimate and someone with a skill=100 can not overestimate. So by the framing of the question alone the participants are nudged to estimate their own skill "more averagely".

fullshark4y ago

Yeah we learned people are bad at giving themselves percentile rankings apparently, especially when the population is ill-defined ("your peers").

https://www.avaresearch.com/files/UnskilledAndUnawareOfIt.pd...

jtc3314y ago· 3 in thread

Because of the effect that is actually found (variance is higher the less achievement) it follows that people you encounter who wildly overestimate their ability are more likely to be people who are poor performers (the same is true for the inverse, but they obviously don't stand out anecdotally to us).

IMO that explains why Dunning Kruger seems intuitively correct even if the conclusion they drew isn't actually correct.

caylus4y ago

> people you encounter who wildly overestimate their ability are more likely to be people who are poor performers

How is this helpful? You won't know whether someone is "overestimating" their ability until you learn both their estimated and actual performance, at which point you don't need to guess whether they're "likely" to have poor actual performance.

jtc3314y ago

It's an explanation for why our anecdotal intuition concludes what it does here.

I think you're misreading the point of my comment.

seventytwo4y ago

Agree. This should have been the original conclusion of DK, if they hadn’t made the mistake.

Another way to show this would have been to keep the auto correlation plot, but compare it to the same plot with statistical noise. With infinite random data, the expected value for self-assessment would be 50% score, regardless of actual score - a flat line through the chart. It would then be significant to find a non-flat line, as DK did.

It’s not inconceivable that with a smaller sample, you’d get some biasing, where lesser skilled people would over estimate, and higher skilled people under estimate.

The follow up studies seem to suggest there’s not really a bias like that, but that there is a “honing” of the general ability to estimate your own outcome, which makes sense.

> Although there is no hint of a Dunning-Kruger effect, Figure 11 does show an interesting pattern. Moving from left to right, the spread in self-assessment error tends to decrease with more education. In other words, professors are generally better at assessing their ability than are freshmen. That makes sense. Notice, though, that this increasing accuracy is different than the Dunning-Kruger effect, which is about systemic bias in the average assessment. No such bias exists in Nuhfer’s data.

apienx4y ago· 2 in thread

> Collectively, the three critique papers have about 90 times fewer citations than the original Dunning-Kruger article.5 So it appears that most scientists still think that the Dunning-Kruger effect is a robust aspect of human psychology.6

Critiques cite the work being critiqued (yes, the referenced critiques in TFA cite the Dunning-Kruger study). Also, a 23 year-old paper will inevitably get cited more than 6 year-old papers. But yeah...the inertia in Science is real. That conservatism's a feature, not a bug.

Psychology's probably the discipline with the shortest "half-life of knowledge". https://en.wikipedia.org/wiki/Half-life_of_knowledge

nomilk4y ago

Thanks for introducing me to the term "half-life of knowledge".

> An engineering degree went from having a half life of 35 years in ca. 1930 to about 10 years in 1960. A Delphi Poll showed that the half life of psychology as measured in 2016 ranged from 3.3 to 19 years depending on the specialty, with an average of a little over 7 years.

This is very interesting and makes me wonder what it is for tech careers, e.g. web devs, data scientists etc.

hallway_monitor4y ago

For javascript developers it's about six months!

Kidding aside, it seems you could estimate it by asking, what portion of the knowledge I use did I learn 20 years ago? Then 10, 5, 1. For me it seems to be somewhere around ten years.

larwent4y ago· 2 in thread

Unless I missed something, this article doesn't explain WHY random data can result in a Dunning-Kruger effect. The relationship between the "actual" and "perceived" score is a product of bounding the scores to 0-100.

When you generate a random "actual" score near the top, the random "perceived" score has a higher chance of being below the "actual" because the numerical range below it is larger than the one above, and vice versa. E.g. a "test subject" with an actual score of 80% has a (uniform random) 20% chance of overestimating their ability and an 80% chance of underestimating it. For an actual score of 20%, they have an 80% chance of overestimating.
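Those percentages are easy to check numerically. The sketch below assumes the perceived score is uniform on 0-100 and independent of the actual score, as in the example; `overestimate_rate` is a made-up helper name.

```python
import random


def overestimate_rate(actual, trials=100_000, seed=0):
    """Fraction of uniform-random self-ratings on [0, 100] that exceed `actual`."""
    rng = random.Random(seed)
    hits = sum(rng.uniform(0, 100) > actual for _ in range(trials))
    return hits / trials


for actual in (20, 50, 80):
    # Analytically the rate is (100 - actual) / 100.
    print(actual, round(overestimate_rate(actual), 3))
```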

once_inc4y ago

A person with an actual score of 80% will probably have enough confidence in his or her abilities due to experience that they will tend not to rate themselves low. Imagine being a graduate student asked how high (s)he would rank. They would not rank themselves as low as they might have when they were sophomore students. They would probably rank within 20% of their actual score, which is what the final graph in the article shows; professors have enough experience to be able to self-assess themselves better than less experienced subjects can.

ImaCake4y ago

As explained in the article, the reason is autocorrelation. Basically the y axis is correlated to the x axis because the y axis is actually x + random noise. The dunning kruger graph is then a transformation of that data - still subject to autocorrelation.
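One way to see this numerically: draw a score and a self-rating that are completely independent, then correlate the score with the error (self-rating minus score). The score sits on both sides, so a strong negative correlation appears even though the raw variables are unrelated. This is a sketch under the assumption of uniform, independent draws; `pearson` is a hand-rolled helper so the example needs nothing beyond the standard library.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


rng = random.Random(0)
score = [rng.uniform(0, 100) for _ in range(10_000)]
self_rating = [rng.uniform(0, 100) for _ in range(10_000)]  # independent of score
error = [s - x for x, s in zip(score, self_rating)]

print(round(pearson(score, self_rating), 2))  # should be close to 0
print(round(pearson(score, error), 2))        # should be close to -1/sqrt(2)
```

For independent variables with equal variance, the theoretical correlation between x and y - x is -1/sqrt(2), roughly -0.71.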

_dain_4y ago· 2 in thread

I don't find the "autocorrelation" explanation intuitive (although it may be equivalent to what I'm about to suggest). The way I think about it, is that it comes about because the y-axis is a percentile rank. How does it actually work for people to give unbiased estimates of their performance as percentiles? For the people at the 50th percentile in truth, they could give a symmetric range of 45-55 as their estimates, and it would be unbiased. But what about the people at the 99th percentile? They can't give a range of 94-104, the scale only goes as high as 100. So even if they are unbiased (whatever that means in this context), their range of estimates in percentile terms has to be asymmetrical, by construction. So, even if people are unbiased, if you were to plot true percentile vs subjective estimated percentile, the estimated scores would "pull toward" the centre. Then the only thing you need to replicate the Dunning-Kruger graph is to suppose that people have a uniform tendency to be overconfident, i.e. that people over-rate their abilities, but to an extent unrelated to their true level of skill. The estimated score at the left side of the graph goes higher, but it can't go as high on the right side of the graph because it butts up against the 100 percentile ceiling. Then you end up with a graph that looks like lower skilled people are more overconfident than higher skilled people are underconfident.
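A rough sketch of this ceiling argument, with made-up parameters: everyone gets the same overconfidence offset and the same symmetric noise, and the result is clipped to the 0-100 scale. The offset of 10 points, the noise spread of 25 points and the name `clipped_estimates` are all assumptions chosen for illustration.

```python
import random
from statistics import mean


def clipped_estimates(n=20_000, overconfidence=10.0, noise_sd=25.0, seed=0):
    """Mean (estimate - true percentile) per quartile of true percentile."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        true_pct = rng.uniform(0, 100)
        # Same offset and same noise for everyone, regardless of skill.
        raw = true_pct + overconfidence + rng.gauss(0, noise_sd)
        est = min(100.0, max(0.0, raw))  # the scale only runs from 0 to 100
        people.append((true_pct, est))
    people.sort(key=lambda p: p[0])
    size = n // 4
    errors = []
    for q in range(4):
        group = people[q * size:(q + 1) * size]
        errors.append(mean(e - t for t, e in group))
    return errors


for q, err in enumerate(clipped_estimates(), start=1):
    print(f"Q{q}: mean error {err:+.1f}")
```

In this model the bottom quartile's error comes out larger than the offset and the top quartile's smaller, even though nothing in the model depends on skill: the floor pushes low scorers' estimates up and the ceiling pushes high scorers' estimates down.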

derbOac4y ago

It's an interesting article but the author is using terms a little incorrectly or strangely I think, and making untrue statements. The basic points are important and interesting to think about, but could've been explained more clearly.

_dain_4y ago

Yes, when I read "autocorrelation" I think of a time-series variable that is correlated with its own lagged values.

danbruc4y ago· 2 in thread

I am no expert in statistics or the Dunning-Kruger effect but this analysis doesn't sound correct to me. If you plot self assessment against test scores then the following will happen. If people are perfect at self assessment, then you get a straight diagonal line. The more wrong they are, the wider the line will get, in the extreme - if the self assessment is unrelated to the test result - the line will cover the entire chart. If people overestimate their performance, the line will move up, if they underestimate their performance, the line will move down. If you look at the Dunning Kruger chart, that is what you see, complicated a bit by the fact that they aggregated individual data points. At low test scores the self assessment is above the diagonal, at high test scores it is below. What matters is indeed the difference between the self assessment and the ideal diagonal, but if you don't plot individual data points but aggregate them, you have to make sure that there is a useful signal - if self assessments are random, then the median or average in each group will be 0.5 and you will get a horizontal line, but that aggregate 0.5 isn't really telling anything useful.

geysersam4y ago

I'm not sure what you mean with "the wider the line will get". But here is the issue:

The least competent person cannot underestimate their relative competency. Any not exactly accurate estimate they do is an overestimate.

Correspondingly, the most competent person cannot overestimate their relative competency.

This leads to the perception of bias where there is none, except a trivial tautological one.

danbruc4y ago

I made you a picture [1]. I randomly generated 100 test scores between 0 and 1, then different self assessments. Top left, self assessment matches actual score, top middle, self assessment varies uniformly by ±0.1 around the test score, top right, self assessment varies uniformly by ±0.2 around the test score. None of those have a Dunning-Kruger effect. If you aggregate data points, there will be - as you mentioned - an edge effect because the self assessment will get clipped.

In the bottom row I added a Dunning-Kruger effect, at a test score of 0.7 the self assessment is perfect, below and above that the self assessment is off by 0.5 times the distance of the test score from 0.7. Otherwise the bottom charts are the same, no random variation on the left, ±0.1 in the middle and ±0.2 on the right. You can see that the edge effect is less important as the data points are steered away from the corners.

I will admit that the original Dunning-Kruger chart could or could not show a real effect, really depends on how they aggregated the data and how noisy self assessments are. But if you have a raw data set like the one I generated, you could easily determine if there is an effect. If one could find such a data set, I would like to have a look.

[1] https://imgur.com/g4frW6p
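For anyone who wants to reproduce something like this without the image, here is a sketch of the same setup. It is a reconstruction from the description above, not the code behind the picture: `simulate` and `fit_line` are made-up names, and the self-assessments are left unclipped for simplicity. With raw data points, a least-squares fit separates the two cases: an unbiased population has slope near 1, while the injected effect (strength 0.5, pivot 0.7) gives slope near 0.5 and intercept near 0.35.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def simulate(n, noise, dk_strength, pivot=0.7, seed=0):
    """Test scores in [0, 1] plus self-assessments.

    dk_strength = 0 is unbiased; 0.5 moves each self-assessment halfway
    from the true score toward `pivot`. Noise is uniform in [-noise, +noise].
    """
    rng = random.Random(seed)
    scores = [rng.random() for _ in range(n)]
    selfs = [s + dk_strength * (pivot - s) + rng.uniform(-noise, noise)
             for s in scores]
    return scores, selfs


for strength in (0.0, 0.5):
    slope, intercept = fit_line(*simulate(2000, noise=0.2, dk_strength=strength))
    print(f"dk_strength={strength}: slope {slope:.2f}, intercept {intercept:.2f}")
```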

titzer4y ago· 2 in thread

I find the article frustrating because of the tone. It's also wrong. They misunderstood what lines mean.

This article is absolutely dripping with condescension throughout and is really pushing a "gotcha" that doesn't exist. It then argues basic statistics, generates a DK-looking graph from random data, and then claims the phenomenon doesn't exist. When in fact, as other people have commented, when people are bad at estimating their own ability (i.e. random), the DK effect still exists; it falls out of statistics.

Sigh, the author misunderstood the very definition of the DK effect:

> "The Dunning–Kruger effect is the cognitive bias whereby people with low ability at a task overestimate their ability. Some researchers also include in their definition the opposite effect for high performers: their tendency to underestimate their skills."

In all the examples, this holds, even if the assessment ability is totally random. Even if every quartile gives themself an average score, like the random data generated here. The author seems to think that it should be even more lopsided or something to demonstrate the effect. (I mean, honestly, what are they expecting, a line above 50th percentile? A line with negative slope? What?)

If there were no DK effect, the two lines would be the same.

Instead, if we go back and look at the original data, we see indeed, the two lines are not the same, the average for the bottom quantile is over 50%, there is some small increase in perceived ability associated with actual ability (and not the opposite).

The sin here isn't some autocorrelation gotcha, but rather, DK should have put error bars on the graph. If it was totally random, the error bars would be all over the place.

bena4y ago

The fact that you can generate a Dunning-Kruger looking graph using nothing but noise does indicate that the graph isn't proof of anything.

He also points out that the problem is that there's nothing below zero and nothing above 100. You can't have people who estimate beyond that. He uses another study and it turns out, the less knowledgeable you are about a skill, the worse you are at estimating your ability at all. In both directions.

If the lines were the absolute difference between perceived ability and actual ability, for no effect, the lines still shouldn't be the same. They should converge towards those who are knowledgeable. If anything, the difference line should be nearly a horizontal line. Because there should be greater variance in estimations at the lower end.

titzer4y ago

It would seem like using violin plots in these types of graphs would help a lot. If low-skilled people are bad at estimating, their variance (and distribution) will be a lot wider.

Dave_Rosenthal4y ago· 1 in thread

This was interesting to me so I spent a while this AM playing with a Python simulation of this effect. I used a simple process model of a normally-distributed underlying 'true skill' for participants, a test with questions of varying difficulty, some random noise in assessing whether the person would get the question right, noise in people's assessments of their own ability, etc.

I fiddled with number of test questions, amounts of variation in question difficulty, various coefficients, etc.

In none of my experiments did I add a bias on the skill axis.

My conclusion is that the "slope < 1" part of the DK effect (from their original graph) is very easy to reproduce as an artifact of the methodology. I could reproduce the rough slope of the DK quartiles graph with a variety of reasonable assumptions. (One simple intuition is that there is noise in the system but people are forced to estimate their percentiles between 0 and 100, meaning that it's impossible for the actual lowest-skill person to underestimate their skill. There are probably other effects too.)

However, I didn't find an easy way using my simulation to reproduce the "intercept is high" part of the DK effect to the extent present in the DK graphs, i.e. where the lowest quartile's average self-estimated percentile is >55%. (*)

However, it strikes me that without a very careful explanation to the test subjects of exactly how their peer group was selected, it's easy to imagine everyone being wrong in the same direction.

(*) EDIT: I found a way to raise the intercept quite a lot simply by modeling that people with lower skill have higher variance (but no bias!) in their own skill estimation. This model is supported by another paper the article references.
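The mechanism in the EDIT can be illustrated with a much simpler model than the simulation described above. This is not that simulation: it assumes a uniform true percentile, zero-mean Gaussian noise whose spread shrinks linearly with skill, and clipping to 0-100. The name `bottom_quartile_estimate` and the spreads 40 and 5 are made up, and the sketch does not try to reproduce the >55% figure.

```python
import random
from statistics import mean


def bottom_quartile_estimate(sd_low_skill, sd_high_skill, n=20_000, seed=0):
    """Mean self-estimated percentile among the lowest-skill quarter."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        true_pct = rng.uniform(0, 100)
        # Noise spread changes linearly with skill; its mean is always zero (no bias).
        sd = sd_low_skill + (sd_high_skill - sd_low_skill) * true_pct / 100
        est = min(100.0, max(0.0, true_pct + rng.gauss(0, sd)))
        people.append((true_pct, est))
    people.sort(key=lambda p: p[0])
    bottom = people[: n // 4]
    return mean(e for _, e in bottom)


print(bottom_quartile_estimate(5, 5))    # same small noise for everyone
print(bottom_quartile_estimate(40, 5))   # low skill -> noisier self-estimates
```

Because the scale has a floor at 0, wide zero-mean noise can only spill upward for the lowest scorers, so the second number comes out well above the first even though no bias was added.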

SomewhatLikely4y ago

Wouldn't variance be influenced by a similar bounding effect but this time from the upper side? That is, if your true skill is 98% you aren't going to ever overestimate by more than 2%, but if your true skill is 50% you could be off by up to 50% in either direction.

askasp4y ago· 1 in thread

If we assume random data then the people at the lower end will over-estimate their own performance the same amount that people on the higher end will under-estimate theirs.

However, if the under-performers consistently over-estimate more than the over-performers under-estimate there is still some merit to the effect, isn't there?

That is, the interesting number is the difference between integral of y-x on lower half vs the integral of y-x on the upper half. Does that make sense to anyone else?
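A discrete version of that comparison, as a sketch: split people at the median actual score, take the mean of y - x in each half, and add the two. The function name `over_under_asymmetry` is made up, and means stand in for the integrals.

```python
import random


def over_under_asymmetry(scores, self_ratings):
    """Mean error (self - actual) in the lower and upper half, and their sum."""
    pairs = sorted(zip(scores, self_ratings))  # order by actual score
    half = len(pairs) // 2
    lower = sum(y - x for x, y in pairs[:half]) / half
    upper = sum(y - x for x, y in pairs[half:]) / (len(pairs) - half)
    return lower, upper, lower + upper


rng = random.Random(0)
x = [rng.uniform(0, 100) for _ in range(20_000)]
y = [rng.uniform(0, 100) for _ in range(20_000)]  # pure noise baseline
lower, upper, total = over_under_asymmetry(x, y)
print(f"lower {lower:+.1f}, upper {upper:+.1f}, sum {total:+.1f}")
```

For pure noise the two halves cancel and the sum sits near zero. With equal-sized halves the sum is just twice the overall mean error, so a clearly positive value in real data would indicate general overconfidence rather than anything specific to low performers.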

m30474y ago

Yeah, I think so and we're probably in the minority here. There are a couple of other comments referring to regression to the mean and that the article takes a literalist view which is perhaps unwarranted. You win the followup comment. ;-)

I confess that I've never paid that much attention to the classic D-K graph, and that taking a close look at it, it is most assuredly crap. Now I want to know what the plots of the actual scores for those quartiles look like rather than %ile, or after-the-fact ranking. Yeah, it sure looks like people mostly figure they're in the 55-75 %ile ranking, if that's what that actually is, and that where in that spread they think they are correlates with their actual ranking.

Let's go down a Bayesian rabbit hole. Let's assume, as does the article, that people's self estimations are completely random rubbish: the worst people have nowhere to go but up, the best nowhere but down. Yup, completely agree.

Now let me ask a question: is self-estimation of any use in determining actual ability? The answer in this case is no: knowing one does not inform our ability to know the other in a Bayesian sense, they are not correlated.

D-K sounds valuable as a cautionary tale concerning excessive exuberance and a tendency not to learn well from experience, but aside from child-proof caps and Mr. Yuk stickers where we really want to apply the lesson is at the high-performing end of the scale and here we get into trouble immediately.

It is tempting to say "high-performers have nowhere to go but down" as though maybe we should reject those self-reporting the best performance. The classic chart hints at high performers underestimating their true performance, but it's a crappy chart; maybe they want it to be true.

But in the specific case where there is utterly no correlation and true performance is as evenly distributed as self-assessment, if we chop off the "top X self-reporting" we will chop off just as many poor performers as high performers. Yes, I hear you, and I agree, random is an edge case; I just don't believe that affects its prevalence.

Maybe it is true; alright dust off those priors and have at it.

fallingfrog4y ago· 1 in thread

OK, I think I understand. What the data from the original experiment actually shows is that people at all skill levels are pretty bad at estimating their skill level- it's just that if you scored well, the errors are likely to be underestimates, and if you scored badly, the errors are likely to be overestimates, by pure chance alone. So it's not that low scoring individuals are particularly overconfident so much as everyone is imperfect at guessing how well they did. Great observation.

civilized4y ago

So it seems... but I still don't understand why the author thinks it's helpful to say "autocorrelation" dozens of times when he could have just said this.

playpause4y ago· 1 in thread

I’ve always felt the DK effect is cynical pseudoscience for midwit egos. It’s a sophistic statement of the obvious dressed up as an insight. But worse, it serves to obvert something interesting and beautiful about humans - that even very intellectually challenged people sometimes can, over time, develop behaviours and strategies that nobody else would have thought of, and form a kind of background awareness of their shortcomings even if they aren’t equipped to verbalise them, allowing them to manage their differences and rise to challenges and social responsibilities that were assumed to be beyond their potential. Forrest Gump springs to mind as an albeit fictional example of the phenomenon I’m talking about. I think this is a far more interesting area than the vapid tautology known as the DK effect.

jollybean4y ago

I think there's an easier explanation for the effect, and that is people are just not very good at judging their skill level, and due to reversion to the mean, low-performers probably overestimate and high-performers underestimate.

And also, I think there is actually a tiny bit of DK going on.

And then, as you say, it gets amplified by the pseudo-literati.

roguecoder4y ago· 1 in thread

The point is that if people estimate their abilities at random, with no information, it will look like people who perform worse over-estimate their performance. But it isn't because people who are bad at a thing are any worse at estimating their performance than people who are good at the thing: they are both potentially equally bad at estimating their performance, and then one group got lucky and the other didn't.

It would require them to be _even worse than random_ for them to be worse at estimating their abilities, rather than simply being judged for being bad at the task. It is only human attribution bias that leads us to assume that people should already know whether they are good or bad at a task without needing to be told.

The study assumed that the results on the task are non-random, performance is objective, and that people should reasonably have been expected to have updated their uniform Bayesian priors before the study began.

If any of those are not true, we would still see the same correlation, but it wouldn't mean anything except that people shared a reasonable prior about their likely performance on the task.

People will nevertheless attribute "accurate" estimates to some kind of skill or ability, when the only thing that happened is that you lucked into scoring an average score. You could ask people how well they would do at predicting a coin flip and after the fact it would look like whoever guessed wrong over-estimated their "ability" and a person who guessed right under-estimated theirs, even though they were both exactly accurate.

This comment section clearly demonstrates the attribution bias that makes this myth appealing, though. And this blog post demonstrates how difficult it is to effectively explain the implications of Bayesian reasoning without using the concept.

roguecoder4y ago

Consider the original study: they used 45 Cornell undergraduate students and asked them about grammar. Grammar isn't objective. Everyone there had performed well on the verbal portion of the SAT, but they weren't studying grammar and hadn't gotten instruction on this particular book of grammar they were judged against. It is very likely that what they were capturing in the "better" or "worse" scores is differences in local dialect.

They then judged people whose beliefs about grammar varied from the one book's beliefs about grammar as having over-estimated their performance. They took people out of one context, asked them how they would behave in a novel context, and everyone made an educated guess. The people who guessed correctly were judged to accurately know their own abilities, when actually they may just have gotten lucky.

Thus what Dunning-Kruger's paper actually says is that if you want people to know how you would like them to perform a task, you can't assume they will read your mind: you have to provide them with actual feedback on their performance.

georgefox4y ago· 1 in thread

This is a fascinating discussion, to which I have little to add, except this. Quoting the article (including the footnote):

> [I]f you carefully craft random data so that it does not contain a Dunning-Kruger effect, you will still find the effect. The reason turns out to be embarrassingly simple: the Dunning-Kruger effect has nothing to do with human psychology[1].

> [1]: The Dunning-Kruger effect tells us nothing about the people it purports to measure. But it does tell us about the psychology of social scientists, who apparently struggle with statistics.

It seems to me that despite rudely criticizing a broad swath of academics for their lack of statistical prowess, the author here is himself guilty of a cardinal statistical sin: accepting the null hypothesis.

The fact that data resemble a random simulation in which no effect exists does not disprove the existence of such an effect. In traditional statistical language, we might say such an effect is not statistically significant, but that is different from saying that the effect is absolutely and completely the result of a statistical artifact.

The nuance of statistics is never-ending.

ImaCake4y ago

Later in the article the author points to an article which does systematically illustrate that the D-K effect is probably not real. They achieve this by using college education level as an independent proxy for test skill with the Y variable being an unrelated assessment of skill - self-assessment. So we can be pretty confident that the D-K effect is at least very small.

poulpy1234y ago· 1 in thread

So Dunning and Kruger were victims of the Dunning-Kruger effect?

js84y ago

The opposite - the publishing pressure makes experts overconfident in their abilities. The non-experts then assume that the experts, for sure, know better, so they don't look for flaws.

nathias4y ago· 1 in thread

Great article. There should be much more common knowledge of statistics and its problems; it is surely the most abused of all sciences.

srvmshr4y ago

Hence the quote:

"There are white lies, damned lies and statistics"

Funny that all the major ML marvels are also built on statistical foundations - a tool used as much as abused.

diwank4y ago

Excerpt from a newer paper by Nuhfer (2017) adds more clarity:

“… Our data show that peoples' self-assessments of competence, in general, reflect a genuine competence that they can demonstrate. That finding contradicts the current consensus about the nature of self-assessment. Our results further confirm that experts are more proficient in self-assessing their abilities than novices and that women, in general, self-assess more accurately than men. The validity of interpretations of data depends strongly upon how carefully the researchers consider the numeracy that underlies graphical presentations and conclusions. Our results indicate that carefully measured self-assessments provide valid, measurable and valuable information about proficiency. …”

https://www.researchgate.net/publication/312107583_How_Rando...

sfvisser4y ago

My intuition for this is: given a fixed and known scoring range (say 0..100), when scoring very low there is simply a lot of room for overestimating yourself and when scoring very high there is simply a lot of room for underestimating yourself. So all noise ends up adding to the inverse correlation naturally.

orf4y ago

> To measure ‘skill’, Nuhfer groups individuals by their education level…

I’m surprised this wasn’t flagged as something pretty silly.

highfrequency4y ago

The author is onto something that Dunning-Kruger is suspicious, but the argument is wrong. The "statistical noise" plot actually demonstrates a very noteworthy conclusion: that Usain Bolt estimates his own 100m ability as the same as a random child's. This would be a great demonstration of the Dunning-Kruger effect, not a counterargument.

On the other hand, regression to the mean rather than autocorrelation does explain how you could get a spurious Dunning-Kruger effect. Say that 100 people all have some true skill level, and all undergo an assessment. Each person's score will be equal to their true skill level plus some random noise based on how they were performing that day or how the assessment's questions matched their knowledge. There will be a statistical effect where the people who did the worst on the test tend to be people with the most negative idiosyncratic noise term. Even if they have perfect self-knowledge about their true skill, they will tend to overestimate their score on this specific assessment.

Regression to the mean has broad relevance, and explains things like why we tend to be disappointed by the sequel to a great novel.
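The regression-to-the-mean scenario described above can be sketched as a small simulation. This is only an illustration under stated assumptions: skill is drawn from a normal distribution, the test adds independent normal noise, and each person's self-estimate equals their true skill exactly. The function name `bottom_quartile_gap` and all the numbers are hypothetical.

```python
import random

def bottom_quartile_gap(n=20_000, noise_sd=10.0, seed=1):
    """Average (self-estimate - test score) among the lowest-scoring quarter."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        skill = rng.gauss(50, 10)                # hypothetical true skill
        score = skill + rng.gauss(0, noise_sd)   # test score = skill + noise on the day
        estimate = skill                         # assumption: perfect self-knowledge
        people.append((score, estimate))
    people.sort()                                # ascending by test score
    bottom = people[: n // 4]
    return sum(est - score for score, est in bottom) / len(bottom)

# Positive means the lowest scorers "overestimated" relative to this one test.
print(bottom_quartile_gap() > 0)
```

Setting `noise_sd=0.0` removes the gap entirely, which suggests that in this toy model the apparent overestimation comes from test noise alone and not from any flaw in self-assessment.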

cryptica4y ago

I've felt inadequate throughout most of my early career. That's how I know that the confidence I have today is well deserved.

I've never had impostor syndrome though. To have impostor syndrome, you have to be given opportunities which are significantly above what you deserve.

I did get a few opportunities in my early career which were slightly above my capabilities but not enough to make me feel like an impostor. In the past few years, all opportunities I've been given have been below my capabilities. I know based on feedback from colleagues and others.

For example, when I apply for jobs, employers often ask me "You've worked on all these amazing, challenging projects, why do you want to work on our boring project?" It's difficult to explain to them that I just need the money... They must think that with a resume like mine I should be in very high demand or a millionaire who doesn't need to work.

I've worked for a successful e-learning startup, launched successful open source projects, worked for a YC-backed company, worked on a successful blockchain project. My resume looks excellent but it doesn't translate to opportunities for some reason.

oh_my_goodness4y ago

Dunning and Kruger showed that students all thought they were in roughly the 70th percentile, regardless of where they actually ranked. That's it. The plots in the original paper make that point very clear.

It is unnecessary to walk the reader through autocorrelation in order to achieve a poorer understanding of that simple result.

IncRnd4y ago

> It’s the (apparent) tendency for unskilled people to overestimate their competence.

Close. It's the cognitive bias where unskilled people greatly overestimate their own knowledge or competence in that domain relative to objective criteria or to the performance of their peers or of people in general.

jl27184y ago

So, they observe a bias toward the average, and the dependence goes exactly as one would naively expect. If scientists exist to explain things we find interesting, statisticians exist to make those things boring. Seriously, work as a data scientist and you end up busting hopes and dreams as a regular part of your job. Almost everything turns out to be mostly randomness. The famous introduction to a statistical mechanics textbook had me pondering this. If life really is just randomness, it’s hard to find motivation. From a different viewpoint, however, I’ve found that the people that embrace this concept by not trying to control things too much, actually end up with the most enviable results, although I may be guilty of selection bias in that sample.

edtechdev4y ago

There have already been responses to this criticism before, such as: https://drbenvincent.medium.com/the-dunning-kruger-effect-pr...

including from David Dunning himself https://thepsychologist.bps.org.uk/volume-35/april-2022/dunn...

sanp4y ago

Seems like a half-baked analysis. You would plot x=x to show where y is above and where it is below. It is useful for exposition. The author questions this as if it is an analytical oversight.

kizer4y ago

I’m not a scientist, but wouldn’t it make sense for standard practice to be to assume at first that there’s a shared variable (that you have introduced) and to look for it until you’re certain the things you’re plotting are independent? Of course they may not be in the end as that’s the “goal”, but the shared variable if there is indeed causation in that case will be what you’re looking for, not one of the variables you “know”.

newbamboo4y ago

Autocorrelation is a much more interesting and much more important topic than DK, which mostly seems to be a popular concept because it supports biases and other fallacious, ego-driven thinking. Autocorrelation is an under-appreciated problem, particularly in the social sciences and econ. So it’s nice to use DK to catch the attention of the masses to spread the word about autocorrelation.

ncmncm4y ago

Most citations of D-K are themselves examples of D-K.

dgb234y ago

Tangential, but the more interesting question for me is:

How does estimating my skill level influence skill growth, social relationships and decision making?

I think there are a bunch of useful angles to this. When there are risk/responsibility opportunities, then I need to be courageous. When it’s about learning and interacting collaboratively, then I need to be humble.

crashingintoyou4y ago

Some other Dunning-Kruger critiques aggregated by Andrew Gelman: https://statmodeling.stat.columbia.edu/2021/10/12/can-the-du...

bandyaboot4y ago

My takeaway, which may be flawed, is that the DK effect really hasn’t been debunked in any fundamental way. It’s just that the effect is statistical rather than psychological. High skilled individuals are still more likely to underestimate their skill level while low skilled individuals are still more likely to overestimate theirs. It’s just that everyone is bad at estimating their skill level and high skilled individuals have more room to estimate below their actual, while low skilled individuals have more room to miss above.

Is my reasoning flawed in some way?

MrYellowP4y ago

And yet, very stupid people are too stupid to recognize that they're very stupid.

Not a single word in that blogpost changes anything about that.

mcguire4y ago

"Academic rank" is an awfully weird proxy for skill, though.

mattwilsonn8884y ago

I would have been more interested in seeing the raw data from the original Dunning-Kruger study reformatted to avoid auto-correlation. Maybe I've skipped over an important detail in my head, but I don't see why plotting perceived test score vs. actual test score would cause any problems; neither variable is in terms of the other.

The final study discussed is convincing, as far as I can tell. By using academic rank (Freshman, Sophomore, ...) they can plot the difference between score and predicted score against rank without auto-correlation. It's just that academic rank seems a possibly unreliable metric and an unnecessary complication - why not just use the data about test scores and predictions of scores which already exists, in a proper statistical interpretation?

trombonechamp4y ago

That is not what the term "autocorrelation" means. Autocorrelation is the correlation of a vector/function with a shifted copy of itself.
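For readers who want that definition made concrete, here is a minimal sketch of lag-based autocorrelation in plain Python. The helper names `pearson` and `autocorrelation` and the toy series `trend` are illustrative, and `lag` is assumed to be at least 1 and smaller than the series length.

```python
def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5

def autocorrelation(series, lag):
    """Correlation of a series with a copy of itself shifted by `lag` steps (lag >= 1)."""
    return pearson(series[:-lag], series[lag:])

# A steadily rising series is strongly correlated with its own recent past.
trend = [t + (t % 3) for t in range(200)]
print(autocorrelation(trend, 1))
```

In this sense autocorrelation involves a single variable compared with its own shifted values, which is different from correlating `y - x` with `x` as the article does.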

knorker4y ago

No it isn't.

If everyone responded that they are 50% skilled (or per this article, that it's randomly distributed), then we

1. See the same graph, and

2. Bad people overestimate, and good people underestimate

This article merely describes Dunning Kruger. Accidentally proves it mathematically, but thinks that it debunks it.

woah4y ago

Seems like Dunning and Kruger suffered from the Dunning-Kruger effect

tpoacher4y ago

DunningKruger.OtherDefinitions.append( article )

andersource4y ago· 40 in thread

That being said, the author seems very confident in their conclusion, and from the comments seems to have read a lot of related analyses, so I might be missing something. ¯\_(ツ)_/¯

leto_ii4y ago

> a world in which people are very bad at estimating their own skill, therefore, statistically, people with lower skills tend to overestimate their skills, and experts tend to underestimate it.

Be careful here, the conclusion you drew doesn't actually follow.

> y - x ~ x, this is called the residual plot

Edit: note on heteroskedasticity

pdonis4y ago

> DK doesn't mean no correlation, it means inverse correlation.

uncomputation4y ago

> DK doesn't mean no correlation, it means inverse correlation

andersource4y ago

>> a world in which people are very bad at estimating their own skill, therefore, statistically, people with lower skills tend to overestimate their skills, and experts tend to underestimate it.

> Be careful here, the conclusion you drew doesn't actually follow.

>> y - x ~ x, this is called the residual plot

tacitusarc4y ago

How does that not follow? It's just regression to the mean.

alecbz4y ago

The author’s confidence is itself an indication that they’re more likely to be wrong.

Kidding. Well, half-kidding, I did kind of find the tone a bit biting and dismissive, especially towards one of the commenters that were pointing out exactly what you did.

andersource4y ago

If the results from DK were similar to the random data results, I'd agree. But the DK results do show some correlation between skill and self-assessment ability.

_dain_4y ago

This is very clear in the original DK paper, they specifically focus on the supposed metacognitive deficiencies of low-skill people.

The article argues that the graphs supposedly demonstrating this fact, can also be generated from a model that does not have this difference, i.e. where bias_lowskill == bias_highskill.

EDIT: My characterization of the article is not correct, see here[1] for a visualization of the point I'm trying to make.

[1] http://emilkirkegaard.dk/understanding_statistics/?app=Dunni...

nbernard4y ago

> The article argues that the graphs supposedly demonstrating this fact, can also be generated from a model that does not have this difference, i.e. where bias_lowskill == bias_highskill.

andersource4y ago

IceDane4y ago

I'm definitely not even close to a statistician, but I'm also having a hard time accepting this analysis.

But even ignoring personal experiences, I'm not convinced by the arguments either. I understand what they are saying, but I don't see how this disproves the DK effect.

But that figure still says that worse performers are then likely to overestimate their own ability, just as much as it says that better performers are bad at it.

The end result is still that people in the bottom quartiles are going to over-estimate their own ability.

I don't know, maybe this is way out in the weeds. Please school me.

motoboi4y ago

To correctly correlate the two variables (assessment and actual-score) you need to correlate the actual data, not measures of its characteristics (average being one of them).

The actual data is shown. Even by eye it’s possible to see no strong (or significant) correlation exists.

denton-scratch4y ago

/me not a statistician either.

cestith4y ago

BlueTemplar4y ago

You seem to be forgetting about under-estimation, so your conclusions don't follow from your premises ?

> most people are pretty bad at self-assessment, but skilled people are a bit less so

This is pretty much the conclusion of the article, except that it isn't the tautologic DK or figure 9 that shows it, but Nuhfer's figure 11.

bouncycastle4y ago

The problem is that they calculated each person’s ‘self-assessment error’ with the actual test score. This error is the difference between a person’s self assessment and their test score,

This is like comparing x - y to x, and if you do this, you will get a correlation no matter what.
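How strong that built-in correlation is can be worked out directly. The derivation below is a sketch under two assumptions that are not stated in the comment: $x$ and $y$ are independent, and both have the same variance $\sigma^2$ (as in the article's uniform random data).

```latex
\begin{aligned}
\operatorname{Cov}(x - y,\, x) &= \operatorname{Var}(x) - \operatorname{Cov}(y, x) = \sigma^2, \\
\operatorname{Var}(x - y) &= \operatorname{Var}(x) + \operatorname{Var}(y) = 2\sigma^2, \\
\operatorname{Corr}(x - y,\, x) &= \frac{\sigma^2}{\sqrt{2\sigma^2}\,\sqrt{\sigma^2}} = \frac{1}{\sqrt{2}} \approx 0.71 .
\end{aligned}
```

Flipping the sign, so that the error is $y - x$ (self-assessment minus test score) plotted against $x$, gives a correlation of about $-0.71$ under the same assumptions, even though $x$ and $y$ carry no information about each other.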

js84y ago

andersource4y ago

This is my understanding as well.

uldos4y ago

Unskilled people are more random with their self assessment than skilled. It has nothing to do with unskilled people thinking that they know everything.

andersource4y ago

> Unskilled people are more random with their self assessment than skilled

This is a statistical claim, supported by the DK graph (but not the random data thought experiment from the article).

> It has nothing to do with unskilled people thinking that they know everything

kenjackson4y ago

usefulcat4y ago

Accujack4y ago

> the author seems very confident in their conclusion

However, even if you prove that their data is not evidence, that doesn't actually say anything about whether the effect exists or not, just that the D-K paper isn't evidence of such an effect.

hgomersall4y ago

This is it. You need to write down the model then perform the inference. With some rudimentary munging you can get a super simple linear model.

uoaei4y ago

andersource4y ago

hgomersall4y ago

Null means "that thing which I've decided to attach special significance to". There's in general no reason to prefer a null hypothesis over any other hypothesis.

bitshiftfaced4y ago

Check out Nuhfer et al 2016, who had a different explanation for why Dunning Kruger wasn't true.

Dunning Kruger effect: lower performers overestimate their ability, and higher performers underestimate their ability.

What they actually found was that higher performers tend to be better at self-assessment. Lower performers are less accurate, but in both directions (not just overconfident).
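To see what "less accurate, but in both directions" could look like, here is a toy model. It is not Nuhfer et al.'s analysis: it simply assumes self-assessment error has zero mean at every skill level and a spread that shrinks linearly as skill rises. The function name `error_summary` and the specific numbers are hypothetical.

```python
import random

def error_summary(n=50_000, seed=2):
    """Mean signed and mean absolute self-assessment error for low and high skill groups."""
    rng = random.Random(seed)
    groups = {"low": [], "high": []}
    for _ in range(n):
        skill = rng.uniform(0, 100)
        # Assumption: noise shrinks linearly with skill and has zero mean (no bias).
        noise_sd = 5 + 20 * (1 - skill / 100)
        error = rng.gauss(0, noise_sd)           # self-estimate minus true skill
        groups["low" if skill < 50 else "high"].append(error)
    return {
        name: (sum(errs) / len(errs), sum(abs(e) for e in errs) / len(errs))
        for name, errs in groups.items()
    }

for name, (signed, absolute) in error_summary().items():
    print(name, round(signed, 2), round(absolute, 2))
```

In this sketch both groups have a signed error near zero, so neither is systematically overconfident, while the low-skill group has a noticeably larger absolute error.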

brnaftr3614y ago

Author includes reference to Nuhfer, related studies can be found here:

https://digitalcommons.usf.edu/numeracy/vol9/iss1/art4/ https://digitalcommons.usf.edu/numeracy/vol10/iss1/art4/

krick4y ago

andersource4y ago

pdonis4y ago

> I really don't see how it concludes that the DK effect is wrong based on the analysis.

Neither do I. Basically what the article actually shows is that these two statements are equivalent:

(1) People with low test scores tend to overpredict their test scores, while people with high test scores tend to underpredict their test scores.

(2) People's predictions of their test scores are uncorrelated (or more precisely very weakly correlated [1]) with their actual test scores.

Also, this statement in the description of the Nuhfer research doesn't make sense:

"What’s important here is that people’s ‘skill’ is measured independently from their test performance and self assessment."

Um, the test performance is the people's "skill". And in the original D-K research, it was "measured independently" from the people's self-assessment (their prediction of their test performance).

hgomersall4y ago

msrenee4y ago

Please feel free to tell me why my interpretation is wrong. I understand stats just well enough to get myself in trouble.

omnicognate4y ago

That's just one study of course, but it sounds like a much better designed one than the original and does constitute actual evidence that the Dunning-Kruger effect doesn't exist.

andersource4y ago

> The fact that the statistical artifact is seen in completely uncorrelated data is only shown as a demonstration that it is not itself evidence of the claimed effect

(Edited for accuracy).

ohwellhere4y ago

It depends on what one means by the "Dunning-Kruger effect."

I had the same impulse that the analysis did not disprove DK, but after sitting with it for overlong I agree with the analysis.

I think there are two competing DK effect definitions that are being conflated, one descriptive and one explanatory:

1. DK shows that less skilled people overestimate their ability, and highly skilled people underestimate it

2. DK shows that people's estimation of their ability is causally determined by their actual ability

I'm not a statistician or a psychologist.

longtimegoogler4y ago

a-dub4y ago

i immediately jump to think that replacing all the data with noise is a pretty good null hypothesis (at least for the analysis). is that not true?

rawgabbit4y ago

TimPC4y ago· 8 in thread

hungrygs4y ago

TimPC4y ago

derbOac4y ago

This all gets really murky quickly in practice because of what "low" and "high" competence means, and what constitutes the actual scope of expertise with reference to a particular scenario.

NaturalPhallacy4y ago

Classic example:

https://en.wikipedia.org/wiki/Beechcraft_Bonanza

sweezyjeezy4y ago

https://xkcd.com/793/

knorker4y ago

> The effect is statistical not psychological.

It is, though. This article says that if people are bad at estimating their skill (towards randomness / the midpoint), then bad people will overestimate, and good people will underestimate.

ImaCake4y ago

> If you adjust the experiment design to avoid introducing the auto-correlation you get data that doesn't show the DK effect at all.

This is even in the article! Yet some people are making claims against this despite references to the contrary.

doetoe4y ago

The article shows that the qualitative claim, that low scoring people tend to overestimate, and high scoring people tend to underestimate their ability, is nothing but a statistical obviety.

For the Dunning-Kruger effect to have psychological significance, you must quantify this, and show that they overestimate their abilities more resp. less than expected.

jakear4y ago· 6 in thread

Article seems to be saying “DK doesn’t exist because it always exists”. Which is… absurd?

larwent4y ago

ewzimm4y ago

I haven't reviewed these types of papers, but if nobody made even that basic adjustment in their analysis, how many others have been missed in experiments like this?

stevage4y ago

It would be possible to rule out that effect in an experiment.

erikmolin4y ago

The statistical phenomenon exists, surely - but I think it will be very confusing for everyone to re-use the same name.

cortesoft4y ago

> In short, “there are more natural numbers above smaller natural numbers than bigger ones”

I get what you are trying to say, but this isn’t true… every natural number has the same amount of numbers greater and smaller… an infinite number.

jakear4y ago

Natural numbers are not the same as integers. They’re strictly positive.

(Now rate your confidence in making that assertion.)

PaulKeeble4y ago· 4 in thread

Phileosopher4y ago

I'm convinced it's associated with the methodology of how psychology has approached matters.

Psychology has the same nightmare as quantum physics: pre-existing thoughts and beliefs literally define what results you end up with.

BlueTemplar4y ago

Well, all sciences have this issue : see Kuhn, Feyerabend...

https://samzdat.com/2018/05/19/science-under-high-modernism/

It would be kind of ironic if psychologists were more susceptible to the "Pop-Baconian" simplification of science ?

And damning if what I've heard about psychology going through multiple paradigms during the 20th century alone is true.

But then, indeed, also understandable, due to the "softness" and "paradigmlessness" of the subject matter, as Kuhn had pointed out the latter back then ?

It's still sad how we now have detailed theories and histories of science, but in practice scientists show no interest in trying to learn from them, nor the mistakes of their predecessors?

Bonus : Wiseman & Schlitz’s attempts to do an adversarial super-controlled parapsychological experiment : ( IV. )

https://slatestarcodex.com/2014/04/28/the-control-group-is-o...

aidenn04y ago

So, implicit in the standards for publishing new research in psychology is "We think there is much greater than a 5% chance that our entire field is wrong" which is not a great place to start from.

roguecoder4y ago

Maybe asking 45 Cornell undergraduates and then generalizing that to "people" was a bad idea after all.

ithkuil4y ago· 4 in thread

Given that fact, it logically follows that people who score low on ability tests will more often than not have overestimated their ability (and the same on the other end of the spectrum).

You can frame this effect as autocorrelation if you wish or just as a logical consequence. But that's missing the point.

The point is: why on earth are humans so bad at estimating their own competence level as to make it practically indistinguishable from random guesses.

d0mine4y ago

ithkuil4y ago

Or is it?

This seems to imply that your incompetence indeed doesn't allow you to truly appreciate the full range of skills that are required to reach a higher level of competence.

In other words, the DK effect itself is the cause of the random distribution of the skill self-assessment (which in turn is the cause of the overestimation secondary effect)

kizer4y ago

But a comment above quoting the more recent paper presents a contradictory conclusion: that humans can self-assess with some accuracy. So now I’m confused again.

d0mine4y ago

You might have meant this comment https://news.ycombinator.com/item?id=31039901 (it says people can self-assess (no bias), more competent people do it better (less variance))

oceliker4y ago· 4 in thread

I think the gist of the article is this:

In reality, nobody had any clue how to estimate their own success. Yet we see the Dunning-Kruger effect in the plot.

andersource4y ago

That's the way I understand the statistical analysis, and in my view this exactly supports (not contradicts) DK:

> In reality, nobody had any clue how to estimate their own success.

Wouldn't that mean unskilled people tend to overestimate their skill, and experts tend to underestimate it? Why is there a contradiction with DK's conclusions?

oceliker4y ago

> Wouldn't that mean unskilled people tend to overestimate their skill, and experts tend to underestimate it?

I think it's because the original paper speculates far beyond it:

The argument about autocorrelation says this "dual burden" doesn't need to be there to observe the effect.

laszlokorte4y ago

fullshark4y ago

Yeah, we learned people are bad at giving themselves percentile rankings apparently, especially when the population is ill-defined ("your peers").

https://www.avaresearch.com/files/UnskilledAndUnawareOfIt.pd...

jtc3314y ago· 3 in thread

IMO that explains why Dunning-Kruger seems intuitively correct even if the conclusion they drew isn't actually correct.

caylus4y ago

> people you encounter who wildly overestimate their ability are more likely to be people who are poor performers

jtc3314y ago

It's an explanation for why our anecdotal intuition concludes what it does here.

I think you're misreading the point of my comment.

seventytwo4y ago

Agree. This should have been the original conclusion of DK, if they hadn’t made the mistake.

It’s not inconceivable that with a smaller sample, you’d get some biasing, where less-skilled people would overestimate and more-skilled people would underestimate.

The follow-up studies seem to suggest there’s not really a bias like that, but that there is a “honing” of the general ability to estimate your own outcome, which makes sense.

apienx4y ago· 2 in thread

Psychology's probably the discipline with the shortest "half-life of knowledge": https://en.wikipedia.org/wiki/Half-life_of_knowledge

nomilk4y ago

Thanks for introducing me to the term "half-life of knowledge".

This is very interesting and makes me wonder what it is for tech careers, e.g. web devs, data scientists etc.

hallway_monitor4y ago

For JavaScript developers it's about six months!

Kidding aside, it seems you could estimate it by asking, what portion of the knowledge I use did I learn 20 years ago? Then 10, 5, 1. For me it seems to be somewhere around ten years.
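
One way to turn that question into a number, as a rough sketch only: assume (this is not from the comment) that knowledge is picked up at a steady rate and goes stale exponentially, so the share of what you use today that is older than t years is 2^(-t/h) for a half-life h. Solving for h gives the hypothetical helper below.

```python
import math

def half_life_years(age_years, fraction_older):
    """Half-life implied by one survey answer.

    Assumed model: the fraction of currently used knowledge that was
    learned more than `age_years` ago equals 2 ** (-age_years / h).
    """
    if age_years <= 0:
        raise ValueError("age_years must be positive")
    if not 0 < fraction_older < 1:
        raise ValueError("fraction_older must be strictly between 0 and 1")
    return age_years * math.log(2) / math.log(1 / fraction_older)

# "Half of what I use, I learned more than 10 years ago"
print(half_life_years(10, 0.5))
```

Asking at several ages (20, 10, 5, 1) gives several estimates of h; if they disagree a lot, the exponential assumption is probably a poor fit.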

larwent4y ago· 2 in thread

once_inc4y ago

ImaCake4y ago

_dain_4y ago· 2 in thread

derbOac4y ago

_dain_4y ago

Yes, when I read "autocorrelation" I think of a time-series variable that is correlated with its own lagged values.

danbruc4y ago· 2 in thread

geysersam4y ago

I'm not sure what you mean by "the wider the line will get". But here is the issue:

The least competent person cannot underestimate their relative competency. Any estimate they make that is not exactly accurate is an overestimate.

Correspondingly, the most competent person cannot overestimate their relative competency.

This leads to the perception of bias where there is none, except a trivial tautological one.
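
A minimal sketch of that boundary effect, under assumptions that are purely illustrative: every guess is the true percentile plus symmetric Gaussian noise (so there is no bias before clipping), and the guess is then clipped to the 0-100 scale. The name `mean_error_by_decile` and the noise level are hypothetical.

```python
import random
import statistics

def mean_error_by_decile(n=50_000, noise_sd=20.0, seed=1):
    """Mean (guess - actual) per decile of actual percentile."""
    rng = random.Random(seed)
    errors = [[] for _ in range(10)]
    for _ in range(n):
        actual = rng.uniform(0, 100)
        guess = actual + rng.gauss(0, noise_sd)  # unbiased before clipping
        guess = min(100.0, max(0.0, guess))      # the scale ends at 0 and 100
        decile = min(9, int(actual // 10))
        errors[decile].append(guess - actual)
    return [statistics.fmean(e) for e in errors]

for d, err in enumerate(mean_error_by_decile()):
    print(f"decile {d}: mean error {err:+.2f}")
```

The bottom deciles should come out positive and the top deciles negative, with the middle close to zero, purely because errors near the ends of the scale can only go one way.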

danbruc4y ago

[1] https://imgur.com/g4frW6p

titzer4y ago· 2 in thread

I find the article frustrating because of the tone. It's also wrong. They misunderstood what lines mean.

Sigh, the author misunderstood the very definition of the DK effect:

If there were no DK effect, the two lines would be the same.

The sin here isn't some autocorrelation gotcha, but rather, DK should have put error bars on the graph. If it was totally random, the error bars would be all over the place.

bena4y ago

The fact that you can generate a Dunning-Kruger looking graph using nothing but noise does indicate that the graph isn't proof of anything.

titzer4y ago

It would seem like using violin plots in these types of graphs would help a lot. If low-skilled people are bad at estimating, their variance (and distribution) will be a lot wider.
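
Short of drawing violin plots, the same information can be summarized as a mean and a standard deviation of the error per quartile. The sketch below uses hypothetical synthetic data in which guesses are unbiased and the noise is assumed to shrink linearly with skill; guesses are deliberately not clipped to 0-100 so that the boundary effect does not interfere. Both function names are made up for illustration.

```python
import random
import statistics

def spread_by_quartile(pairs):
    """pairs: (actual, self_assessed). Returns (mean_error, stdev_error) per quartile."""
    ranked = sorted(pairs, key=lambda p: p[0])
    size = len(ranked) // 4
    out = []
    for q in range(4):
        errors = [s - a for a, s in ranked[q * size:(q + 1) * size]]
        out.append((statistics.fmean(errors), statistics.stdev(errors)))
    return out

def synthetic_pairs(n=20_000, seed=2):
    """Hypothetical data: unbiased guesses whose noise shrinks as skill grows."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        actual = rng.uniform(0, 100)
        noise_sd = 30.0 - 0.25 * actual  # assumed: 30 at the bottom, 5 at the top
        pairs.append((actual, actual + rng.gauss(0, noise_sd)))
    return pairs

for q, (mean_err, sd_err) in enumerate(spread_by_quartile(synthetic_pairs()), start=1):
    print(f"Q{q}: mean error {mean_err:+.2f}, stdev {sd_err:.2f}")
```

With data like this the mean error stays near zero in every quartile while the spread narrows toward the top, a pattern that a plot of quartile means alone would hide.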

Dave_Rosenthal4y ago· 1 in thread

I fiddled with number of test questions, amounts of variation in question difficulty, various coefficients, etc.

In none of my experiments did I add a bias on the skill axis.

However, it strikes me that without a very careful explanation to the test subjects of exactly how their peer group was selected, it's easy to imagine everyone being wrong in the same direction.

SomewhatLikely4y ago

askasp4y ago· 1 in thread

If we assume random data then the people at the lower end will over-estimate their own performance the same amount that people on the higher end will under-estimate theirs.

However, if the under-performers consistently over-estimate more than the over-performers under-estimate there is still some merit to the effect, isn't there?

That is, the interesting number is the difference between integral of y-x on lower half vs the integral of y-x on the upper half. Does that make sense to anyone else?
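
As a sketch of that statistic: the integrals are replaced here by mean errors over the lower and upper halves, which differ from the integrals only by a scale factor. The function name and the toy data are hypothetical.

```python
import statistics

def over_under_asymmetry(pairs):
    """pairs: (actual, self_assessed) percentiles.

    Returns (mean error in lower half, mean error in upper half, their sum).
    A sum near zero means over- and under-estimation cancel out.
    """
    ranked = sorted(pairs, key=lambda p: p[0])
    half = len(ranked) // 2
    lower = statistics.fmean([s - a for a, s in ranked[:half]])
    upper = statistics.fmean([s - a for a, s in ranked[half:]])
    return lower, upper, lower + upper

# Toy data: the lower half overestimates by 10, the upper half underestimates by 2.
toy = [(x, x + 10 if x < 50 else x - 2) for x in range(100)]
print(over_under_asymmetry(toy))
```

For purely random self-assessments the two halves should roughly cancel, so a clearly positive sum would be the part that random noise does not explain.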

m30474y ago

Maybe it is true; alright dust off those priors and have at it.

fallingfrog4y ago· 1 in thread

civilized4y ago

So it seems... but I still don't understand why the author thinks it's helpful to say "autocorrelation" dozens of times when he could have just said this.

playpause4y ago· 1 in thread

jollybean4y ago

And also, I think there is actually a tiny bit of DK going on.

And then, as you say, it gets amplified by the pseudo-literati.

roguecoder4y ago· 1 in thread

If any of those are not true, we would still see the same correlation, but it wouldn't mean anything except that people shared a reasonable prior about their likely performance on the task.

roguecoder4y ago

georgefox4y ago· 1 in thread

This is a fascinating discussion, to which I have little to add, except this. Quoting the article (including the footnote):

> [1]: The Dunning-Kruger effect tells us nothing about the people it purports to measure. But it does tell us about the psychology of social scientists, who apparently struggle with statistics.

The nuance of statistics is never-ending.

ImaCake4y ago

poulpy1234y ago· 1 in thread

So Dunning and Kruger were victims of the Dunning-Kruger effect?

js84y ago

The opposite - the publishing pressure makes experts overconfident in their abilities. The non-experts then assume that the experts, for sure, know better, so they don't look for flaws.

nathias4y ago· 1 in thread

Great article. Knowledge of statistics and its problems should be much more common; it is surely the most abused of all sciences.

srvmshr4y ago

Hence the quote:

"There are white lies, damned lies and statistics"

Funny that all the major ML marvels are also built on statistical foundations - a tool used as much as abused.

diwank4y ago

Excerpt from a newer paper by Nuhfer (2017) adds more clarity:

https://www.researchgate.net/publication/312107583_How_Rando...

sfvisser4y ago

orf4y ago

> To measure ‘skill’, Nuhfer groups individuals by their education level…

I’m surprised this wasn’t flagged as something pretty silly.

highfrequency4y ago

Regression to the mean has broad relevance, and explains things like why we tend to be disappointed by the sequel to a great novel.

cryptica4y ago

I've felt inadequate throughout most of my early career. That's how I know that the confidence I have today is well deserved.

I've never had impostor syndrome though. To have impostor syndrome, you have to be given opportunities which are significantly above what you deserve.

oh_my_goodness4y ago

It is unnecessary to walk the reader through autocorrelation in order to achieve a poorer understanding of that simple result.

IncRnd4y ago

> It’s the (apparent) tendency for unskilled people to overestimate their competence.

jl27184y ago

edtechdev4y ago

There have already been responses to this criticism before, such as: https://drbenvincent.medium.com/the-dunning-kruger-effect-pr...

including from David Dunning himself https://thepsychologist.bps.org.uk/volume-35/april-2022/dunn...

sanp4y ago

Seems like a half-baked analysis. You would plot the identity line (y = x) to show where the data falls above it and where it falls below. It is useful for exposition. The author questions this as if it were an analytical oversight.

kizer4y ago

newbamboo4y ago

ncmncm4y ago

Most citations of D-K are themselves examples of D-K.

dgb234y ago

Tangential, but the more interesting question for me is:

How does estimating my skill level influence skill growth, social relationships and decision making?

crashingintoyou4y ago

Some other Dunning-Kruger critiques aggregated by Andrew Gelman: https://statmodeling.stat.columbia.edu/2021/10/12/can-the-du...

bandyaboot4y ago

Is my reasoning flawed in some way?

MrYellowP4y ago

And yet, very stupid people are too stupid to recognize that they're very stupid.

Not a single word in that blogpost changes anything about that.

mcguire4y ago

"Academic rank" is an awfully weird proxy for skill, though.

mattwilsonn8884y ago

trombonechamp4y ago

That is not what the term "autocorrelation" means. Autocorrelation is the correlation of a vector/function with a shifted copy of itself.
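
For reference, the usual sample autocorrelation at a given lag can be sketched as follows. This is the common textbook estimator, written out by hand; the function name is just illustrative.

```python
import statistics

def autocorrelation(series, lag):
    """Sample autocorrelation of `series` with a copy of itself shifted by `lag`."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = statistics.fmean(series)
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series is constant")
    num = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return num / denom

alternating = [1, -1] * 50
print(autocorrelation(alternating, 1))  # strongly negative: neighbours always differ
print(autocorrelation(alternating, 2))  # strongly positive: every second value repeats
```

Note that this compares one variable with its own lagged values, which is a different thing from correlating y - x with x.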

knorker4y ago

No it isn't.

If everyone responded that they are 50% skilled (or, per this article, that it's randomly distributed), then:

1. We see the same graph, and

2. Bad people overestimate, and good people underestimate

This article merely describes Dunning-Kruger. It accidentally proves it mathematically, but thinks that it debunks it.
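
The "everyone answers 50" case needs no randomness at all. A tiny sketch, assuming evenly spaced actual percentiles and a hypothetical helper name:

```python
def constant_guess_errors(n=100, guess=50.0):
    """Per-quartile mean of (self-assessed - actual) when everyone answers `guess`."""
    actual = [100 * (i + 0.5) / n for i in range(n)]  # evenly spaced percentiles
    size = n // 4
    return [
        sum(guess - a for a in actual[q * size:(q + 1) * size]) / size
        for q in range(4)
    ]

print(constant_guess_errors())  # [37.5, 12.5, -12.5, -37.5]
```

The bottom quartile overestimates by 37.5 points and the top quartile underestimates by the same amount, which is the point being argued: the pattern follows from the answers being uninformative, whatever one decides to call that.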

woah4y ago

Seems like Dunning and Kruger suffered from the Dunning-Kruger effect

tpoacher4y ago

DunningKruger.OtherDefinitions.append( article )
