If members of my nation get drunk more often than members of some other nation, it may be offensive to say that I am "34% drunkard," but on average the inference might hold; instead of forbidding this type of inference, I'd rather rely on more signals to figure out what kind of person I specifically am before individualized decisions are made. The authors sidestep the problem by introducing a "risky behavior" variable that isn't in the input dataset and modeling it as a hidden state in Bayesian inference, yet "risky behavior" can still be correlated with ethnicity and red cars anyway; the correlation is just not visible from the outside. So if my nation is 34% drunkard and the neighboring one is only 11%, the conditional probability will still likely come out higher for my nation, merely obfuscated by the use of a Bayesian hidden state. I'm not sure why that would improve fairness.
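To make the point concrete, here is a toy sketch (all numbers made up, not from the paper) of why a latent "risky behavior" variable with a group-dependent base rate still produces different posteriors per group:

```python
# Hypothetical illustration: a hidden "risky" variable inferred from an
# observable signal (red car), where the prior on "risky" differs by group.
# All probabilities below are invented for demonstration.

def posterior_risky(p_risky_given_group, p_red_given_risky, p_red_given_safe):
    """P(risky | red car, group) via Bayes' rule with a group-dependent prior."""
    p = p_risky_given_group
    num = p_red_given_risky * p
    den = num + p_red_given_safe * (1 - p)
    return num / den

# Suppose 34% of group A and 11% of group B are "risky" (the hidden variable),
# and risky drivers are assumed somewhat more likely to pick a red car.
for group, base_rate in [("A", 0.34), ("B", 0.11)]:
    post = posterior_risky(base_rate, p_red_given_risky=0.30, p_red_given_safe=0.15)
    print(group, round(post, 3))
```

Even though "group" never appears as an explicit input feature of the final score, the hidden variable inherits its base rate, so group A's posterior comes out well above group B's for the same observed red car.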
It would only paralyze those who paid attention to the Toronto Declaration. You’re right that you can’t make ML fair, because the universe isn’t fair; fairness is a property of human judgements about facts, and the facts remain the same regardless of one’s feelings.
https://www.chrisstucchio.com/pubs/slides/crunchconf_2018/sl...
AI Ethics, Impossibility Theorems and Tradeoffs
This is the crux of the issue and as always, most people seem to miss it. Often “fair” is used as shorthand for “does what I think is right”.
Isn't this just a misleading way to say "holding a certain causal belief"? Why exactly would that be a bad thing? If you reject one set of causal beliefs, you necessarily hold a different set.
It's a much broader problem than that, because the direction of causation can be extraordinarily difficult to establish in general.
Changing the color of your car shouldn't change your ethnicity, but what if it does? Suppose you're white with Spanish ancestry and Hispanics are the group who like red cars. Paint your car red and some red-car-preferring Hispanics may be more inclined to associate with you and thereby cause you to be more immersed in Hispanic culture and start to identify as Hispanic rather than white.
And that's a silly one just to show that even the exemplar could be wrong. More plausibly, what if the causation between "risky behavior" and "red car" is reversed? We know that colors can affect human behavior. If getting into a red car makes you drive more aggressively then you have a direct causal chain between being more likely to buy a red car (for any reason) and being more likely to drive aggressively and get into a car crash.
That means that in order to use this you would first need to establish the direction of causation between the two behaviors. But that's a tall hill to climb when one of the factors you're trying to establish causation for is exactly the one you don't have good data on.
There is also a straightforward way to tell when a method like this is definitely getting the math wrong: does it make the prediction rate for that class of people worse? If your assumptions are correct it shouldn't, so if it does, you've unambiguously failed.
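That sanity check is easy to automate. A minimal sketch (function names and toy data are my own, not from any particular fairness toolkit): compare each group's prediction accuracy before and after the adjustment and flag any group that got worse.

```python
# Sanity check: did the "fairness" adjustment make predictions worse
# for any group? If yes, the causal assumptions are suspect.
# All names and data here are hypothetical.

def accuracy(preds, labels):
    """Fraction of predictions matching the true labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def check_no_worse(group_labels, baseline_preds, adjusted_preds, true_labels):
    """Return the list of groups whose accuracy dropped after adjustment."""
    worse = []
    for g in set(group_labels):
        idx = [i for i, gl in enumerate(group_labels) if gl == g]
        base = accuracy([baseline_preds[i] for i in idx],
                        [true_labels[i] for i in idx])
        adj = accuracy([adjusted_preds[i] for i in idx],
                       [true_labels[i] for i in idx])
        if adj < base:
            worse.append(g)
    return worse

# Toy example: group B's predictions degrade after the adjustment.
flagged = check_no_worse(group_labels=["A", "A", "B", "B"],
                         baseline_preds=[1, 0, 1, 1],
                         adjusted_preds=[1, 0, 0, 1],
                         true_labels=[1, 0, 1, 0])
print(flagged)  # ['B']
```

A non-empty result is the unambiguous-failure signal the comment describes: the method claimed to model the group better but predicts it worse.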
And every time I express my desire for autobahns without speed restrictions to crisscross North America, whoever I'm talking to has generally been quick to inform me that Germans can have nice things like that because they are careful/skilled drivers, while Americans are reckless (wreckful) drivers and cannot be trusted at high speeds.
Now you may argue that correlation reflects causation in a particular case, sure. But in general they are not the same, so it seems perfectly logical to me that you can start building your model with certain causal assumptions and without others, without in any way disregarding your statistics.
Because, as people give lip service to constantly, but never seem to really adhere to, correlation is different from causation.
Making a system fair at the very least requires people designing the system to be fair. It's pretty clear that still does not happen, so I'm pretty skeptical of those that claim it's just around the corner.
It used to be considered fair to let people smoke when they wanted. Then it was considered fair to have smoking sections and non-smoking sections in restaurants. Now it’s considered fair to ban smoking entirely in restaurants and most public places.
You have a choice of whether or not you believe being male causes car insurance claims. That is independent of the statistical correlations. Ten times a day people say correlation is not causation, but a hundred times a day, I see people implicitly insisting that it necessarily is.
If I'm running an insurance agency rather than a public policy advocacy group, and my data keeps showing that men have a higher accident rate than women, I can just ignore causation and build my actuarial tables on that correlation. I don't need a causal model here, at least not until the point where I want to optimize my models further still, and there are diminishing returns on that.
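The purely correlational approach really is this simple. A toy sketch (the claim data and pricing rule are invented for illustration): compute per-group claim rates straight from history and load the premium accordingly, with no causal story anywhere.

```python
# Actuarial-table sketch: per-group claim rates from historical records,
# then a premium loaded by the observed rate. No causal model involved.
# Data and the pricing formula are made up for illustration.
from collections import defaultdict

# (group, had_claim) records from past policies.
history = [("M", 1), ("M", 0), ("M", 1), ("F", 0), ("F", 1), ("F", 0)]

totals = defaultdict(lambda: [0, 0])  # group -> [claims, policies]
for group, had_claim in history:
    totals[group][0] += had_claim
    totals[group][1] += 1

rates = {g: claims / n for g, (claims, n) in totals.items()}

# Simple loading: base price scaled by (1 + observed claim rate).
base_price = 500
premiums = {g: base_price * (1 + r) for g, r in rates.items()}
print(rates, premiums)
```

Whether the rate difference is causal or a proxy for something else never enters the calculation, which is exactly the commenter's point: the table predicts fine either way, and that is also why this approach sits at the center of the fairness debate.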
The reason fairness has so much headway among engineers isn't just an aversion to discrimination among educated people. It's that we all know this stuff is way jankier than we care to ever admit, and that none of us would want to be the data sausage going through the algorithm grinder.