Take something like a bank loan. If you had a model at a bank which took credit score, income, wealth, and collateral into account, black Americans would have loans rejected at a higher rate than white Americans. Is this model racist? No, this model doesn't even know what race is, all it knows is credit scores, income, wealth, and collateral. Does the fact that black Americans used to be slaves in the US, or were kept out of certain housing markets, contribute towards the fact that black Americans, on average, have lower credit scores, income, wealth, and collateral? Of course. But is this model racist? Literally not at all. It is completely unbiased, and exactly what the model should be. If the case you're making is that you think that there should be a national effort to correct for historical injustices that were done by the state by actively discriminating by race, that is a completely different discussion.
Having all of our decision-making apparatuses factor in the infinite pile of historical injustices that may have contributed to an individual's particular circumstances is not the way to go. Keep models simple and limited to what is relevant for the particular decision. Fix injustices further upstream, or you make the whole system a convoluted nightmare.
That is what proponents of the structural racism model are doing. Here's an example I took from the book Weapons of Math Destruction:
When people are convicted of a crime, they undergo a number of personality tests, including the LSI-R (Level of Service Inventory - Revised). This is a highly detailed questionnaire that asks about prior convictions, whether the prisoner had accomplices in their crimes, whether drugs or alcohol were involved, etc.
It does not ask about race.
What it does ask about are things which highly correlate with race, such as the number of police encounters (no criminal suspicion necessary), the number of friends/family/neighbours who have committed crimes, etc. If two first-time offenders have committed identical crimes but one of them grew up in wealthy suburbs and the other grew up in the rough inner city, they will receive very different scores on the LSI-R.
So what do they use the LSI-R for? They feed it into a model which assigns the offender a recidivism risk score. Then they use that risk factor directly when determining the person's sentence, restrictions, parole eligibility, etc.
So now we're not even talking about historical injustices, we're talking about ongoing injustice based on historical injustice. It's a vicious cycle, or a self-reinforcing feedback loop, if you will. This is a serious problem!
Edit: Just to add another piece of the puzzle, the reason wealthy suburbs vs rough inner cities correlate so highly with race is a direct result of the historical racist practices of redlining [1] and white flight [2]. Now combine that with grinding poverty (also a result of redlining and segregation) and the war on drugs, and the result is high-crime neighbourhoods in the inner city. Those high crime neighbourhoods attract highly increased police presence, which leads to more convictions, which leads to more patrols, etc. This is another vicious cycle which feeds into the above statistical model.
Your indicator for whether or not a model is racist cannot simply be that the model produces outputs delineated by race in a manner you find unpalatable. So long as the model is not actually using race as a means of predicting outcomes, any behavior that looks racist would simply be due to including poor features.
"Crucially, incorporating more proximal and predictive variables into models, rather than relying on race variables to act as proxies, will improve transportability of algorithms across contexts."
If we want better models then they need to also model structural racism.
The article addresses this:
> When “race-neutral” approaches are employed in model development, prediction will tend to be poorer for racial minority populations.... Two explanations for differentially poorer model performance can be addressed by collecting more data: too few observations of members of racial minority groups and unrepresentative sampling that can differentially limit generalizability. However, an additional cause of algorithmic bias is not well appreciated and cannot be overcome simply by adding more of the same kind of data to a learner....
But you also have to realize that this is always going to be somewhat arbitrary.
Statistical models as used in real-world systems don't have a concept of a "causal factor". It literally doesn't matter to the model why certain zip codes have more property crime. It doesn't care if it's caused by the poverty of residents, by the pigmentation of their skin, by lead in the paint, or by the cultural traits of residents. All it cares about is the correlation: if the risk is higher, the insurance premiums go up too. For some it might seem unfair, and for some groups such statistical discrimination might be illegal (though not for all, e.g. it's perfectly legal to charge men higher insurance rates, which suggests that the moral principle here is not equality, but rather compensation for historical mistreatment), but from the model's and the business's perspective, such reasoning is undoubtedly correct.
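To make that concrete, here is a toy simulation (all numbers invented for illustration) in which the model never sees group membership, yet its outputs still split by group because a proxy feature, the zip code, carries the correlation:

```python
import random

random.seed(0)

# Toy data: group is never a model input, but group membership correlates
# with zip code, and claim rates differ by zip code (illustrative numbers).
population = []
for _ in range(100_000):
    group = random.choices(["A", "B"], weights=[0.8, 0.2])[0]
    # Group B is concentrated in zip 2 (the proxy variable).
    zip_weights = [0.9, 0.1] if group == "A" else [0.3, 0.7]
    zip_code = random.choices([1, 2], weights=zip_weights)[0]
    # Claim risk depends only on zip code here, not on group.
    claim = random.random() < (0.05 if zip_code == 1 else 0.15)
    population.append((group, zip_code, claim))

# "Premium" model: charge each zip code its observed claim rate.
# Group membership is never consulted.
premium = {}
for z in (1, 2):
    claims = [claim for g, zc, claim in population if zc == z]
    premium[z] = sum(claims) / len(claims)

# Average premium by group differs anyway (roughly 2x in this setup),
# because zip code acts as a proxy for group.
for g in ("A", "B"):
    prems = [premium[zc] for gg, zc, _ in population if gg == g]
    print(g, round(sum(prems) / len(prems), 3))
```

The model is "race-blind" in the literal sense described above, yet its premiums end up correlated with group, which is exactly the proxy-variable dynamic the rest of the thread is debating.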
"A national effort to correct historical injustices" is one way but not the only. The people who create these models can refine the model or create others that determine acceptable business risks to provide loans to an under-served market.
What? In what world is this assumption being made? Do we assume that every person was born into a stable household? That every person has the same IQ, the same height, looks, had the same lucky encounters with the right people whose needs intersected with their capabilities? There are countless dimensions along which people are not the same, why would you ever assume that whether or not someone gets a bank loan has taken into account every advantage or disadvantage they have been given?
A bank loan is a business transaction where the likelihood of you being able to pay back the loan at the prescribed interest rate is being determined based off of highly predictive features, that's it.
If you want to do corrective social justice, do it in a handful of places, and let the rest of the system operate off of sensible rules. Social justice cannot permeate every single decision made in our society, it is irreducibly complex even on a single decision.
You should check out the disparate impact rule if you're involved in housing.
https://www.hud.gov/press/press_releases_media_advisories/HU...
That being said, what is unfortunately NOT shocking is that anyone upvoted this comment at all, and that it doesn't have a negative score.
Pretending something doesn't exist (or ignoring the fact that it does exist), and modeling systems under that pretense, doesn't make the thing not exist - it only reinforces the existence of the thing.
I can understand the parent commenter downvoting, esp if not convinced by the position of the article or the subsequent comments.
I knew better than to wade into this topic given this site's audience demographics; I am hardly surprised at the reaction from lurkers.
I am encouraged and heartened by the commenters that actually read the article and have provided excellent reasons why you would want to build systems and models that account for systemic racism and bias.
Very illuminating all around.
Isn't the argument being made in various places that race is in the data regardless of whether or not you encode it as a feature, because the humans who created the data already used race as part of creating it? And by creating the data, I mean the interactions in real life that create the inputs.
You can't escape it because it's already in the inputs to the model because it's a rather insidious part of our society.
Generally speaking...
Ignoring or dismissing the relevance of race is a privilege for those in the majority, who can trivialize something that doesn't apply to them. Apply that to models too and see how much they miss; value can be found where others fail to look.
It's like deciding that because one group hasn't experienced racism, no one else could have either. It also signals that if they don't see value or understanding in it, there can't be any in it.
It takes a truly open mind to entertain any viewpoint that isn't immediately their own.
The reality is, many people live with a reality that might seem unimaginable to viewpoints like the above.
I practice each day as if humanity is one family. I go out of my way to talk to every kind of person who doesn't look like me. That doesn't mean strangers talk to me, especially those from the majority.
When programmers all think and look the same, and grew up in the same ways and places, software tends to miss edge cases for everyone who doesn't look like them, especially in areas like computer vision.
Step beyond CV and you find things like automatic motion-sensor sinks that only detect certain shades (or lack thereof) of skin.
Just some food for thought, happy to chat offline too :)
Upstream may think the model--which to them looks like a black box--is perfectly rational, optimally profitable, and socially beneficial in a way that it is not. We have numerous examples where a computer has driven a decision and humans carried out its orders in ways that harmed people. Remember the man violently dragged off the United Airlines flight in 2017[1]? ICE justifies detention by tweaking their risk management software to always recommend detention[2].
This is why we need to care as people building the systems that make these decisions.
1. https://www.nbcnews.com/storyline/airplane-mode/united-fiasc...
2. https://www.vice.com/en_us/article/evk3kw/ice-modified-its-r...
Saying that because history was racist, I'm absolved of responsibility going forward is not a strong argument. Redlining was literally racist behavior. The point of the article is to be aware of it so you can try to do better than in the past.
Basically, African Americans exhibit much higher heart rate variability, meaning their nervous system is much quicker to react to stimuli (quicker time to fight or flight response, for example) and this still isn’t well understood in the field.
A naive understanding is that racial physiology is just different. And plenty of people will stand by this. However, self reported stress scores offer some insight into the difference.
High stress African Americans with High HRV lived as long as Low stress, low HRV White/Asian Americans. Most likely, the process by which the nervous system regulates itself is heavily influenced by life course events.
Medical science, in my experience, lacks in quantifying these social factors, and too often underplays their significance in determining physiological differences. Humans are incredibly dynamic systems, and the case can be made that we adapt to stimuli in order to survive. It's certainly possible that the physiological differences we observe in different racial populations are due to survival based on this principle.
It’s only recently that I’ve seen research trying to get at these social/physiological mechanisms, but as far as funding is concerned, hard biological sciences are more interesting. Everyone just wants to edit the genome and call it a day, but I think we could get much further if we understood how life events lead to physiological ailments later in life.
Putting your head in the sand and trying to deny the existence of potentially uncomfortable facts actually fuels these fringe thinkers more imo. Part of their whole schtick is that the truth is being hidden from them.
Look at how the media handled the claim that Serena Williams couldn't beat a top 10 male player. Instead of actually putting it to the test, the whole angle was about how insulting and preposterous that was etc etc.
We are not all the same, but we all deserve to be treated equally. It's as simple as that. Trying to halt scientific progress because it doesn't fit your world view is quasi-religious.
>quicker time to fight or flight response
If cops are more likely to stop you because of your skin color (also further increasing the chances of further abuses), that would probably have an impact on your physiology over time.
https://www.nationalgeographic.com/magazine/2018/04/the-stop...
If you live in a neighborhood where gun violence is a common occurrence, do you still get startled when you hear gunshots? If so, this would have a physiological effect, because it breaks your rhythm in an abrupt way, and does so at the frequency of hearing gun shots.
Obviously this is highly theoretical, but if true it would mean that just being in the proximity of violence puts you at an extra health risk, one that very few people would assume.
Hacker News at its finest ;)
It is not an interesting result to say models not modeling reality are less accurate; the cogent discussion is to what degree systemic racism exists IN reality. This is textbook begging the question.
> Acknowledgements: Conflict of Interest: None declared.
> Funding: Whitney R. Robinson is supported by the National Institute of Minority Health
I am not familiar with standards of conflict declaration, but this looks like a pretty clear conflict of interest to me.
https://www.npr.org/sections/goatsandsoda/2017/05/28/5302041...
1. If you have two people with identical relevant behavior and different races, you want the model to score them identically.
2. Each race should receive a comparable distribution of scores.
3. The scores should be as accurate of a predictor of ground facts as possible.
Relax the first desideratum and your model is now either explicitly or implicitly (via irrelevant proxy variables) using race to determine results, opening you up to racial discrimination lawsuits. Relax the second desideratum and your model is now creating disparate impact across racial groups, opening you up to racial discrimination lawsuits. Relax the third and you're leaving accuracy, and thus money, on the table.
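The tension between the second and third desiderata can be shown with a back-of-the-envelope calculation (the base rates and error rates below are invented for illustration): if a classifier has identical true and false positive rates for two groups, but the groups have different base rates of the predicted outcome, then a positive prediction is far less reliable for the lower-base-rate group.

```python
def ppv(base_rate, tpr=0.8, fpr=0.2):
    """Positive predictive value: P(actually positive | predicted positive).

    Same classifier behavior (tpr, fpr) applied to groups whose base
    rates of the outcome differ.
    """
    true_pos = base_rate * tpr
    false_pos = (1 - base_rate) * fpr
    return true_pos / (true_pos + false_pos)

# Identical error rates for both groups, different base rates:
print(round(ppv(0.5), 3))  # → 0.8
print(round(ppv(0.2), 3))  # → 0.5 (a "positive" means much less here)
```

Equalizing PPV across groups instead forces the error rates apart, which is the well-known impossibility result lurking behind the three-way trade-off above.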
In an article about structural racism, I would have expected more here.
It is these cultural differences which cause most of the group disparity in America, not "structural racism", yet the "critical race theorists" (race hustlers and grievance mongers) and their followers ignore these major factors and replace them with straw men.
In order to fix a problem, it's important to understand the actual causes. The sociologists and other assorted race hustlers will only divide us and lead us astray.
> Structural racism refers to “the totality of ways in which societies foster [racial] discrimination, via mutually reinforcing [inequitable] systems...(e.g., in housing, education, employment, earnings, benefits, credit, media, health care, criminal justice, etc.) that in turn reinforce discriminatory beliefs, values, and distribution of resources,” reflected in history, culture, and interconnected institutions (Bailey and others, 2017).
I think I might be misunderstanding, but given that this includes "culture", is it sufficiently broad that hypothetical scenarios such as this (no idea if this is accurate) would be captured: "white people are culturally more likely to use crystal meth than other racial groups, ergo they are victims of (a certain kind of) systemic racism"?
It seems like this is just a catchall for any kind of error associated with a racial group, and the article is merely cautioning against such errors. If so, it raises the questions "why not just say so?" and "why use such a loaded term as systemic racism?".
Suppose you are designing a facial recognition system for police to use in the field while investigating a recent crime to see if anyone with a criminal history is nearby.
(Data taken from: https://en.wikipedia.org/wiki/Incarceration_in_the_United_St...) Because blacks are over-represented in the US criminal justice system (40% of the prison population vs 13% of the population) and because part of what defines "black" is the outward appearance of certain facial features, a facial-recognition algorithm which is trained to recognize criminals, with a cost function based on prediction accuracy alone, and facial features as input parameters is likely going to have false positives that over-represent blacks.
It's very important to consider this when you develop a training set. The developer's error (mostly a failure to understand Bayes' theorem) might work something like this: they take 100 innocent people's faces at random (on average, only 13 will be black), then take 100 random criminal faces from inmates (on average, 40 will be black).
Then mix up the groups into your training set and assign a prediction score of 1 or 0 depending on whether or not your classifier correctly predicted that a face was in the criminal group. Then, based on no feature other than race, your neural net can get better performance solely by guessing more often that black people are criminals. That's not a good thing. In fact, if it's looking at a black face from its training set, the odds are about 3 to 1 that it's one of the criminals, even though the odds are at least 2 to 1 against a random black person having a criminal history.
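To check the arithmetic with the stated numbers (13% of the general population, 40% of the prison population, a balanced 100/100 training set):

```python
# Composition of the hypothetical training set described above.
innocent_black = 100 * 0.13   # ~13 black faces among the 100 innocents
criminal_black = 100 * 0.40   # ~40 black faces among the 100 criminals

# Within the training set, P(criminal | black):
p = criminal_black / (criminal_black + innocent_black)
print(round(p, 2))  # → 0.75, i.e. roughly 3-to-1 odds

# So a classifier that learns nothing but "black face => criminal" is
# right about 75% of the time on black faces in this training set,
# purely because of how the set was assembled.
```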
The likelihood of being falsely identified as having a criminal history is much greater based solely on the variable of being black. And this type of thing has happened several times already in production systems!
Conversely, the same system, trained on the same data in the same training set, can get higher performance than random by simply guessing that any non-hispanic white person does NOT have a criminal history.
Thus, it's pretty important to correct your training set to reflect the correct Bayesian prior, and the underlying structures that sometimes go by the label "structural racism" or "institutional racism" are essentially exactly that reality in this case.
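One way to do that correction (a sketch using a simple importance-weighting scheme; the 3% population prior below is an invented illustrative number, not from the thread) is to reweight training examples so the effective label prior matches the population:

```python
def sample_weights(labels, population_prior):
    """Weights that shift a training set's label prior to population_prior.

    labels: 1 = has criminal history, 0 = does not.
    """
    train_prior = sum(labels) / len(labels)
    w_pos = population_prior / train_prior
    w_neg = (1 - population_prior) / (1 - train_prior)
    return [w_pos if y else w_neg for y in labels]

# Balanced 100/100 training set like the one above, but suppose only
# ~3% of the population has a criminal history (illustrative number):
labels = [1] * 100 + [0] * 100
weights = sample_weights(labels, population_prior=0.03)

# Effective prior after weighting now matches the population:
effective = sum(w for w, y in zip(weights, labels) if y) / sum(weights)
print(round(effective, 2))  # → 0.03
```

Most ML libraries accept per-sample weights at training time, so a correction like this can slot in without changing the model itself.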
To be honest, I don't think our understanding of these systems is mature enough for us to be just throwing them out into our societal systems right now. There needs to be a lot more testing because the possibility of unethical results is pretty damn high.
That said, I still think that the concept of structural racism is a bad way to look at this problem, as it's simply one form of a common error when looking at sample sets.
It's now leaking into other fields. I remember when I first heard STEM being changed to STEAM to include "arts": I laughed at how inclusive, and utterly useless, it is.
Also, I think you would marvel at how many things you would call "scientific study" are also heavily politicized.
If this bothers you a great deal, I would recommend trying to find comfort in thinking about postmodern sociology as a religion different from yours. They won't be bothered by it, and it will probably fit your mindset in a more soothing way than thinking of them as scientists. It's not that they are trying to publish their findings in ACM TOPLAS or something; they have their own community and books and kinda like it.
Take the “Sokal 2.0” affair, where some profs sent obviously bogus research to various journals. Notably, while they were able to get “rape culture among dogs at the dog park” (or maybe it was racism, easy to look up) published in a gender studies journal, they couldn’t get published in sociology journals. Sociology has standards. The absurdities committed by gender studies as an institution don’t falsify racism/other forms of oppression. They might be wrong, but you’re certainly not right.