0 points · bboyrival · 5y ago · 0 comments

This is to measure the acceptance rate gap overall, not the gap for a specific host/guest/experience. Then, after changing the product, you can measure the gap post facto and see if you actually did anything. You can't change what you can't measure.

3 comments · 1 top-level

AnthonyMouse · 5y ago · 2 in thread

But then what do you do with that if you don't understand the causes? It could be a systemic bias in your algorithm. It could be that the apparent disparity is just confounders you haven't accounted for. It could be that some of the hosts are overtly racist. It could be any combination of any of these.

You can just A/B test it with random changes until it goes away, but that doesn't mean you solved the problem, only that you made the metric a target.

If the problem is that some of the hosts are overtly racist, you could make the disparity go away by creating an algorithmic bias the other way, but that only increases the overall unfairness. The racists are still there harming the people they harm, and then you introduce an additional harm to some entirely different innocent people in unrelated transactions. That doesn't help the people originally being harmed, because the undue beneficiaries of your changes are completely different people who just happen to be the same race as the original victims, which makes the numbers balance even though overall unfairness has gone up rather than down. It's the equivalent of "solving" cops killing disproportionately many innocent black people by having the cops kill more innocent white people.

Meanwhile if the problem was algorithmic bias to begin with then you still have to understand the specific means by which the bias operates, otherwise you're susceptible to doing the same thing. Your algorithm was improperly disadvantaging Chris to the benefit of Chaz, you modify it to additionally improperly disadvantage Anna to the benefit of Alicia, and now your numbers balance because Chaz and Anna are the same race and the harms cancel out for your metric, even though they don't cancel out in real life for the people affected because they're different people.
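To put numbers on the Chris/Chaz/Anna/Alicia case, here is a minimal sketch. Everything in it is made up for illustration: the group labels "X" and "Y", the idea of a per-person "error" (how far each person's treatment is from what it should have been), and the helper names `aggregate_gap` and `total_unfairness`. It also assumes Chris and Alicia share the other group, which is what makes the numbers balance.

```python
from collections import defaultdict

# Hypothetical per-person errors: 0.0 would mean the person is treated fairly.
# Group labels are assumptions; only "Chaz and Anna are the same race" comes
# from the comment above.
before_fix = {
    "Chris": {"group": "Y", "error": -1.0},  # improperly disadvantaged
    "Chaz": {"group": "X", "error": +1.0},   # improperly advantaged
}
after_fix = {
    **before_fix,
    "Anna": {"group": "X", "error": -1.0},    # newly disadvantaged
    "Alicia": {"group": "Y", "error": +1.0},  # newly advantaged
}


def aggregate_gap(people):
    """Mean error of group X minus mean error of group Y."""
    by_group = defaultdict(list)
    for person in people.values():
        by_group[person["group"]].append(person["error"])
    means = {g: sum(errs) / len(errs) for g, errs in by_group.items()}
    return means["X"] - means["Y"]


def total_unfairness(people):
    """Sum of absolute per-person errors; harms do not cancel here."""
    return sum(abs(person["error"]) for person in people.values())


for label, people in (("before", before_fix), ("after", after_fix)):
    print(label, "gap:", aggregate_gap(people),
          "unfairness:", total_unfairness(people))
```

The group-level gap drops to zero after the "fix", while the sum of individual harms doubles. The metric can't tell those two situations apart from a genuinely fair outcome.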

And if the problem was that you weren't correctly accounting for confounders then there wasn't a real racial disparity there to begin with, and by making the un-adjusted numbers balance you created one.
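A toy example of the confounder case, with entirely invented numbers. The confounder here is a hypothetical "booking type" (advance vs. last-minute); nothing says that is the relevant variable in the real data. Within each booking type both groups are accepted at exactly the same rate, but the groups differ in how often they make each type of request.

```python
# (group, booking_type) -> (requests, accepted); all counts are made up.
counts = {
    ("X", "advance"): (80, 64),
    ("X", "last_minute"): (20, 8),
    ("Y", "advance"): (20, 16),
    ("Y", "last_minute"): (80, 32),
}


def rate(group, booking_type=None):
    """Acceptance rate for a group, optionally within one booking type."""
    requests = accepted = 0
    for (g, b), (n, k) in counts.items():
        if g == group and booking_type in (None, b):
            requests += n
            accepted += k
    return accepted / requests


# Unadjusted gap: looks like a large disparity between the groups.
unadjusted_gap = rate("X") - rate("Y")

# Gap within each booking type: no disparity at all.
stratified_gaps = {
    b: rate("X", b) - rate("Y", b) for b in ("advance", "last_minute")
}

print("unadjusted gap:", round(unadjusted_gap, 2))
print("gap within each booking type:", stratified_gaps)
```

Forcing the unadjusted gap to zero in this data would require treating the two groups differently within at least one booking type, which is exactly the disparity that wasn't there before.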

You have to actually understand the causes and mechanisms before you can devise a solution. Just having aggregate statistics doesn't do it.

I'm inclined to think that it actually makes it worse, because then people only care about the statistics. But the statistics can show a disparity when everything is fine, because it's just confounders, and they can show balance when everything is not fine, because you're making multiple errors in different directions that sum to zero in aggregate but not for the people affected.

bboyrival (OP) · 5y ago

I recommend reading the technical paper if you're interested in the methodology, seeing as it covers the things you're thinking about.

AnthonyMouse · 5y ago

No, it doesn't. It talks a lot about how they're anonymizing the data, which we know generally doesn't work anyway:

https://arstechnica.com/tech-policy/2009/09/your-secrets-liv...

Plus they essentially admit that anonymizing the data makes it less useful and their claimed solution is to use a larger sample size. Which is like offering "buy more fuel" as a solution to poor fuel economy. It's no solution to the efficiency loss (at any given sample size it's still worse) and you may not always be able to get a larger sample size.

None of which addresses the problem I identified anyway, which is related to identifying the cause of the disparity once one is discovered, which is already almost intractably hard even without anonymized data.

The much better solution is to identify specific instances of discrimination and address them (and the mechanism of discrimination they represent) regardless of what the statistics say because, again, aggregate statistics can both say that something is wrong when it's not and that nothing is wrong when it is, and the only way to tell is by looking at the individual cases.