On the other hand, these biases (most notably the racial ones) exist in the process anyway, and now they're simply being codified and exposed. If these algorithms were published we could see exactly how much more punishment you get for being black in America versus being white.
Thanks again to ProPublica for an important piece of reporting; hopefully changes get made for the better.
http://www.scientificamerican.com/article/lunchtime-leniency...
And of course it goes without saying that judges will be affected by their biases, racial and otherwise.
I'm not sure what to do about it, though. Handing down the exact same punishment for every single person who commits a particular crime seems too blind. But any variation is going to be problematic.
Even where a variable has absolutely no real effect, roughly one out of every twenty combinations with the other variables will show up as a statistically significant predictor of future crime at the conventional p = 0.05 cutoff.
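That one-in-twenty intuition is easy to demonstrate with a quick simulation. Everything below is made-up data; the 1.96 cutoff is the usual large-sample approximation for p = 0.05:

```python
import random
import statistics

random.seed(0)

def false_positive_rate(trials=2000, n=50):
    """Fraction of null comparisons that look 'significant' at p = 0.05.

    Both groups are drawn from the same distribution, so any
    'significant' difference is a false positive by construction.
    """
    hits = 0
    for _ in range(trials):
        a = [random.gauss(0, 1) for _ in range(n)]
        b = [random.gauss(0, 1) for _ in range(n)]
        # Two-sample z statistic; |z| > 1.96 corresponds to p < 0.05
        se = (statistics.variance(a) / n + statistics.variance(b) / n) ** 0.5
        z = (statistics.mean(a) - statistics.mean(b)) / se
        if abs(z) > 1.96:
            hits += 1
    return hits / trials

print(false_positive_rate())  # close to 0.05, i.e. about one in twenty
```

Run enough variables through enough combinations and "significant" findings are guaranteed, bias or no bias.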
Furthermore, the algorithm would simply extend existing biases in arrest and sentencing, because it simply can't account for crimes that are uncaught and unpunished. Groups that are stopped, searched, arrested, and convicted at greater rates would without fail be sentenced to more time. Just another benefit of being white in America.
You end up using the fact that some groups are punished more often to justify punishing them more harshly.
Even worse, I bet that the fact that it thinks that women are at a higher risk for recidivism means that somewhere within the algorithm it's using the fact that women in general are less criminal than men to decide that women who do commit crime are more exceptional (within women), and therefore more deviant. It's disgusting. If you can't legally discriminate against a person on particular grounds, you certainly can't feed those grounds into an algorithm to let it discriminate for you while you shrug and feign innocence.
The algorithm is the innocent one - it's just attempting to reflect the system as it is. It's like an algorithm you would write to predict the winners of horse races, or to run a sports book. And just like one of those algorithms, if you stuff it with garbage (the kind of garbage that makes it wrong 77% of the time), it will produce garbage. If you use the results for something internal to the system, bad variables will feed back into themselves and make the results progressively worse - what's the effect of a longer sentence on recidivism? How profitable is the arbitrage on your sports book algorithm if people use the results to bet, and the distribution of bets shifts the odds?
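The feedback effect can be caricatured in a few lines. Every number below is invented (the coefficients, the update rule); the point is only that feeding a model's outputs back into its own training data moves its estimates, here monotonically upward:

```python
def simulate(rounds=20):
    """Crude feedback-loop sketch: the model's risk estimate sets the
    sentence, longer sentences worsen measured outcomes, and the worse
    outcomes are fed back in as training data. All constants are
    invented purely to illustrate the loop, not to model sentencing.
    """
    measured_risk = 0.3
    history = []
    for _ in range(rounds):
        sentence_years = measured_risk * 10          # model output drives sentence
        observed_rate = 0.3 + 0.02 * sentence_years  # longer sentence, worse outcome
        # "retraining": blend the old estimate with the outcomes it caused
        measured_risk = 0.5 * measured_risk + 0.5 * observed_rate
        history.append(round(measured_risk, 3))
    return history

print(simulate())  # drifts upward from 0.3 toward a worse fixed point
```

The true base rate never changed; only the loop did.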
One thing is certain -- the federal government needs to shut these sentencing analysis companies down, or at the very least subject them to heavy public audits. I'd say even libertarians would agree this is the definition of something that should be regulated.
"On Sunday, Northpointe gave ProPublica the basics of its future-crime formula — which includes factors such as education levels, and whether a defendant has a job. It did not share the specific calculations, which it said are proprietary."
How on earth can you lock people up based on secret information? That is Kafka meets Minority Report.
Variables used in the formula include details of the case, race/appearance of the defendant, and how recently the judge had eaten at the time of sentencing. Unlike ProPublica's claims of racial bias (which are merely "almost statistically significant" at the p = 0.05 level), the lunch bias is statistically significant at the p < 0.01 level.
http://www.pnas.org/content/108/17/6889.full
This system sounds like a huge improvement.
In particular, the cases are heard in a particular order. For each prison, the prisoners with counsel go before those who are representing themselves. As in the US, those representing themselves typically fare worse. The judges try to finish an entire prison's worth of hearings before a meal, so the least-likely-to-succeed cases tend to be assigned to the slots right before a break.
There are some other bits of weirdness in the original data too. They found a statistically significant association between the ordinal position (e.g., 1st, 2nd, ..., last) and the parole board's decision, but failed to find any effect of actual time elapsed (e.g., in minutes), even though the latter is much more compatible with a physiological hypothesis like running out of glucose.
Now if they were using decision trees - e.g., "if the person has 3 or more felonies, they get a 5 rating" - that could be presented.
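For comparison, a transparent rule of that kind is trivially presentable. The thresholds below are invented for illustration; the point is that every branch can be read and audited, unlike a proprietary black-box formula:

```python
def risk_rating(prior_felonies: int, age: int) -> int:
    """Toy example of a fully auditable scoring rule.

    The cutoffs are hypothetical; a real instrument would publish its
    actual thresholds so defendants could see exactly why they got
    the rating they did.
    """
    if prior_felonies >= 3:
        return 5
    if prior_felonies >= 1:
        return 3 if age < 25 else 2
    return 1

print(risk_rating(prior_felonies=4, age=30))  # 5
print(risk_rating(prior_felonies=0, age=40))  # 1
```

Nothing about accuracy requires secrecy; the secrecy is a business decision.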
I'm curious about how much of a feedback loop this process has. The model was probably trained on old data and never updated. Also, how does it take into account features it doesn't know about (the article mentions one man turning to Christianity)? I doubt there is a mechanism for people to be asked why they did or did not reoffend, and even if there were, how much should it be trusted?
I also worry greatly about diagnostic predictive models that maximise overall prediction success but don't balance the relative consequences of false positives and false negatives.
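One standard remedy is to pick the decision threshold by expected cost rather than raw accuracy. A minimal sketch with invented scores and hypothetical cost weights:

```python
def best_threshold(scores_labels, fp_cost=1.0, fn_cost=5.0):
    """Pick the score threshold that minimises expected cost, not error count.

    scores_labels: list of (score, reoffended) pairs. The cost weights
    are hypothetical policy choices, e.g. saying a missed reoffender is
    five times worse than a wrongly flagged person, or the reverse.
    """
    thresholds = sorted({s for s, _ in scores_labels})

    def cost(t):
        fp = sum(1 for s, y in scores_labels if s >= t and not y)
        fn = sum(1 for s, y in scores_labels if s < t and y)
        return fp * fp_cost + fn * fn_cost

    return min(thresholds, key=cost)

data = [(0.9, True), (0.8, True), (0.7, False), (0.4, True),
        (0.3, False), (0.2, False), (0.1, False)]
print(best_threshold(data))                             # 0.4
print(best_threshold(data, fp_cost=5.0, fn_cost=1.0))   # 0.8
```

Same model, same data, opposite thresholds: the choice of cost weights is a moral judgment that a raw accuracy number silently hides.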
http://boingboing.net/2016/01/06/weapons-of-math-destruction...
It's easy to hide agenda behind an algorithm; especially when the details of the algorithm are not publicly visible.
https://github.com/propublica/compas-analysis/blob/master/Co...
In the statistical analysis (unlike the verbiage) she is completely unable to hide the lack of bias and the accuracy of the algorithm, both of which are clearly on display at line [36]. In contrast, her verbiage somehow conveys the exact opposite impression.
> Black defendants are 45% more likely than white defendants to receive a higher score correcting for the seriousness of their crime, previous arrests, and future criminal behavior.
> Women are 19.4% more likely than men to get a higher score.
> Most surprisingly, people under 25 are 2.5 times as likely to get a higher score as middle aged defendants.
> The violent score overpredicts recidivism for black defendants by 77.3% compared to white defendants.
> Defendants under 25 are 7.4 times as likely to get a higher score as middle aged defendants.
> [U]nder COMPAS black defendants are 91% more likely to get a higher score and not go on to commit more crimes than white defendants after two years.
> COMPAS scores misclassify white reoffenders as low risk at 70.4% more often than black reoffenders.
> Black defendants are twice as likely to be false positives for a Higher violent score than white defendants.
> White defendants are 63% more likely to get a lower score and commit another crime than Black defendants.
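Rates like those can be reproduced from a per-group confusion table. The toy counts below are invented, though roughly shaped like ProPublica's published false positive/negative rates:

```python
def rates(group):
    """group: list of (scored_high, reoffended) booleans.
    Returns (false_positive_rate, false_negative_rate).
    """
    fp = sum(1 for hi, re in group if hi and not re)
    neg = sum(1 for hi, re in group if not re)
    fn = sum(1 for hi, re in group if not hi and re)
    pos = sum(1 for hi, re in group if re)
    return fp / neg, fn / pos

# Invented counts per 200 defendants, shaped like the reported finding:
black = ([(True, False)] * 45 + [(False, False)] * 55
         + [(True, True)] * 72 + [(False, True)] * 28)
white = ([(True, False)] * 23 + [(False, False)] * 77
         + [(True, True)] * 52 + [(False, True)] * 48)

print(rates(black))  # (0.45, 0.28): more non-reoffenders flagged high
print(rates(white))  # (0.23, 0.48): more reoffenders flagged low
```

Whether those asymmetric error rates count as "bias" is exactly what the two sides of this thread are arguing about.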
Calling out one specific section that doesn't show bias doesn't magically exonerate the rest.
Larson is still the second author, so it is certainly a big question how he can present data showing no statistically significant correlation between race and score, then put his name on an article saying the exact opposite, one that is clearly pushing an agenda. And as noted, the owners of the publication are also involved in a competing risk assessment product.
This article is terrible data journalism and probably deliberately misleading.
Step 1: write down conclusion.
Step 2: do analysis.
Step 3: if analysis doesn't support conclusion, write down a bunch of anecdotes.
Really, here's her R script: https://github.com/propublica/compas-analysis/blob/master/Co...
Just read that. It's vastly better than this nonsensical article.
> We obtained the risk scores assigned to more than 7,000 people arrested in Broward County, Florida, in 2013 and 2014 and checked to see how many were charged with new crimes over the next two years, the same benchmark used by the creators of the algorithm.
> The score proved remarkably unreliable in forecasting violent crime: Only 20 percent of the people predicted to commit violent crimes actually went on to do so.
> The formula was particularly likely to falsely flag black defendants as future criminals, wrongly labeling them this way at almost twice the rate as white defendants. White defendants were mislabeled as low risk more often than black defendants.
> Could this disparity be explained by defendants’ prior crimes or the type of crimes they were arrested for? No. We ran a statistical test that isolated the effect of race from criminal history and recidivism, as well as from defendants’ age and gender.
> Black defendants were still 77 percent more likely to be pegged as at higher risk of committing a future violent crime and 45 percent more likely to be predicted to commit a future crime of any kind.
https://github.com/propublica/compas-analysis/blob/master/Co...
Their own analysis shows (p ~= 0) that the high and medium risk factors are predictive. It also shows that the racial bias terms (race_factorAfrican-American:score_factorHigh, etc.) are probably not predictive (p > 0.05).
Your quotes are not evidence of bias, though I see how they might confuse an innumerate reader. It's interesting how good a job this article is doing confusing the innumerate - it's almost as if it was written to mislead without technically lying.
For example, black defendants being pegged as more likely to commit crimes can be caused by one of two things: bias, or black defendants actually being more likely to commit crimes. According to ProPublica's own analysis (see race_factorAfrican-American), the latter is actually the case, with p = 4.52e-06 - see line [36].
Let's be clear -- if the null hypothesis in this case is true (that there is no bias), and all other assumptions made are true, there is a slightly greater than 5.7% chance of obtaining this result (or something even more skewed). That's a great bar for publication of SCIENCE. It's not a great bar for hiding behind a proprietary algorithm used in sentencing. People talk about misuse of p-values, but this takes the cake.
I'm confused though; the mood affiliation of your post somehow suggests that her less than perfect choice of a statistical methodology somehow supports her claims. Could you explain that? Or am I simply misunderstanding what you are trying to say?
Also, let's suppose we just take her own analysis at face value, and don't view it through the p-value lens. The maximum likelihood estimate suggests that even if this effect is not random chance, it's not very big. I.e., the "score factor high" estimate is >8x larger than the "score factor high, race = black" estimate. Isn't this really good? Do you really think the human biases that this algorithm mitigates are smaller than this?
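For anyone following along: on the odds scale that logit coefficients live on, a small interaction term next to a large main effect really is a small multiplier. The coefficient values below are invented stand-ins for scale, not the actual fitted numbers from the R script:

```python
import math

# Hypothetical stand-ins for fitted logistic-regression coefficients
# (the real values live in ProPublica's script; these are for scale):
beta_score_high = 1.2       # main effect: a High COMPAS score
beta_high_x_black = 0.14    # interaction: High score AND black defendant

# exp() converts log-odds coefficients into odds ratios:
print(round(math.exp(beta_score_high), 2))    # 3.32: score multiplies odds ~3.3x
print(round(math.exp(beta_high_x_black), 2))  # 1.15: interaction adds only ~15%
```

So even granting the interaction is real, on these illustrative numbers it is a modest adjustment riding on top of a much larger, race-neutral score effect.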
Lastly, what specific analysis would convince you that this algorithm is predictive and non-biased (or more realistically, not very biased)?
When I look at the two Kaplan-Meier (KM) plots for whites and blacks, they are mostly the same. It's pretty clear the model is not prejudiced against blacks; in fact it's somewhat prejudiced against whites. [1]
Your main editorial claim is that whites tend to be misclassified as "good" and blacks as "bad."
But I think what's actually happening is that the algorithm is more likely to misclassify low_risk as "good" and high_risk as "bad".[2] Combine that with vastly more whites than blacks being low_risk (as you show earlier) and you get the observed "injustice".
I'll also note that the KM curve for whites flattens out at 2 years, unlike the one for blacks. This is actually a big deal if statistically significant. But that's a separate conversation.
Footnotes:
1 - this is acknowledged in methodology page "black defendants who scored higher did recidivate slightly more often than white defendants (63 percent vs. 59 percent)."
2 - why that is I don't yet fully understand (and I'd like to), but it looks to be simple math that follows from the low-risk group mostly not recidivating and the high-risk group mostly recidivating
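That footnote-2 intuition can be made precise. If a score has the same PPV (chance that a high score really reoffends) and the same sensitivity in two groups that differ only in base rate, the false positive rate mathematically must differ: rearranging PPV = TP/(TP+FP) gives FPR = p/(1-p) * (1-PPV)/PPV * sensitivity. A sketch with invented numbers:

```python
def fpr(prevalence, ppv=0.6, sensitivity=0.65):
    """False positive rate forced by calibration.

    Assumes the score is equally calibrated (same PPV) and equally
    sensitive in both groups; the groups differ only in base rate p.
    All three numbers are invented for illustration.
    """
    p = prevalence
    return p / (1 - p) * (1 - ppv) / ppv * sensitivity

print(round(fpr(0.5), 3))  # 0.433: higher base rate -> higher FPR
print(round(fpr(0.3), 3))  # 0.186: lower base rate -> lower FPR
```

In other words, a score can be unbiased in the calibration sense and still produce unequal false positive rates whenever base rates differ; you can't have both properties at once, which is why the two sides of this thread keep talking past each other.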
Does this sentence, "Northpointe does offer a custom test for women, but it is not in use in Broward County.", imply that the base COMPAS model does not take gender into account?
In other words, if you are a repeat offender, in some cases you think you know what your lawyer can do for you. But a system replacing that, one that is overly harsh, may deter you. All things being equal in a system of punishment, I think I want the one that's got some deterrence in it. So this is worth exploring.
If every criminal knew that getting caught meant being put into a meat grinder of sorts, I wonder how that would change their thinking about how to navigate the world and problem solve.
https://github.com/propublica/compas-analysis/blob/master/Co...