In aggregate, in large data sets, race comes through, even with just a few datapoints. For example, when I worked at a fintech company, with only household income and zip code we could infer race with >80% accuracy [0]. Add a few more datapoints, and this would very quickly get closer to 95%.
That was an _actual_ party-trick [1] demo we did, alongside de-anonymizing coworkers based on car model, zip code, and bank name.
[0] I worked as a SecEng and we were trying to prove that we were(n't) inadvertently targeting race, for compliance reasons. In the end, the business realized the threat and made the required changes to prevent this.
[1] We were doing this to make a case for stricter controls and stronger isolation/security measures for storing non-PII data. The business also saw the light on this. Sometimes we'd only narrow a coworker down to 30 or 40 people in their zip code, and sometimes (such as a coworker with an old Bentley) it was an instant hit.
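To make the "30 or 40 people vs. instant hit" point concrete, here is a minimal sketch of counting how many records share a combination of quasi-identifiers (the usual k-anonymity check). The records, field order, and function names are all made up for illustration; this is not the demo described above.

```python
from collections import Counter

# Hypothetical "non-PII" records: (car_model, zip_code, bank_name).
# Every value here is invented for illustration.
records = [
    ("Civic", "94110", "BankA"),
    ("Civic", "94110", "BankA"),
    ("Civic", "94110", "BankB"),
    ("Camry", "94110", "BankA"),
    ("Camry", "94110", "BankA"),
    ("Bentley", "94110", "BankA"),
]


def anonymity_set_sizes(rows):
    """Map each quasi-identifier combination to the number of rows sharing it."""
    return Counter(rows)


def unique_rows(rows):
    """Return the combinations that match exactly one row (an 'instant hit')."""
    return [combo for combo, n in anonymity_set_sizes(rows).items() if n == 1]


sizes = anonymity_set_sizes(records)
# The smallest group size is the k for which this data is k-anonymous.
k = min(sizes.values())
print("k =", k)
print("uniquely identifiable:", unique_rows(records))
```

Any combination with a count of 1 identifies a single person even though no field is PII on its own, which is roughly the argument for treating such data with stricter controls.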
(The program involved having children who were in regular contact with the criminal justice system.)
If the participants did represent a subset of the target audience, I don't really see what the problem is if that audience happens to be heavily weighted towards a particular race, sex, etc. It seems like you'd be doing a disservice to the program to purposely control for those factors and end up with a population that physically looks more diverse at the cost of missing people who actually most need the program.
But it is the thing you do care about when you want to attribute causality. In part this is an issue because people naturally associate correlation with causation (there is good reason for that, but it's a long discussion; see Judea Pearl's The Book of Why). At the end of the day, we really are always after causal relationships, because we want to do things with the data (somewhere along the chain). So it's not that you want to remove race from the data, but rather that you want to be wary and ensure that your variable is not confounding the real issue. Though this happens outside of race too.
And note that there are times where race does play a causal role (though I suspect that's not likely in the parent's case). For example, different races may be more prone to certain illnesses or genetic disorders.
If it helps, maybe it is easier to frame it this way: it's easy to be lazy, but the pressure around race makes us more likely to revisit our analysis and look for confounding variables. The thing is, this will improve your stats even in the non-minority settings, because what you're (hopefully) doing is just making better models.
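A toy illustration of what "look for confounding variables" means in practice: compare the raw gap between two groups with the gap inside each stratum of a suspected confounder. The groups ("A", "B"), the strata ("low", "high"), and all counts are invented so the arithmetic is easy to follow; it is a sketch of stratification, not a real analysis.

```python
# (group, stratum) -> (number of people, number with the outcome).
# All counts are hypothetical.
counts = {
    ("A", "low"): (80, 40),
    ("A", "high"): (20, 2),
    ("B", "low"): (20, 10),
    ("B", "high"): (80, 8),
}


def rate(group, stratum=None):
    """Outcome rate for a group, optionally restricted to one stratum."""
    cells = [
        v for (g, s), v in counts.items()
        if g == group and (stratum is None or s == stratum)
    ]
    people = sum(n for n, _ in cells)
    hits = sum(h for _, h in cells)
    return hits / people


# Raw comparison: group A looks much worse than group B.
raw_gap = rate("A") - rate("B")

# Stratified comparison: within each stratum the two groups are identical.
stratified_gaps = {s: rate("A", s) - rate("B", s) for s in ("low", "high")}

print("raw gap:", round(raw_gap, 2))
print("gap within strata:", stratified_gaps)
```

In this made-up data the whole raw gap comes from the groups being spread differently across the strata, so group membership is standing in for the stratum variable rather than driving the outcome.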
You're overconstraining what I've said. You're perfectly right that zipcodes in Appalachia account for many poor people who are also white. But in saying so you're showing that you can still infer race from this, because you're inferring that the majority of people in these zipcodes are white. Right? White people are also a race. You're correct that zip code is also able to strongly indicate poor white people. In fact, it is even able to strongly indicate rich black people. As you might guess, not to the same degree, since the overall rate is lower, but people do congregate.
Think about it in a different framing: zipcode strongly correlates with people congregating together who are culturally and economically similar.
I think this version should make sense (especially as the locality affects the culture), and from here you can extrapolate to recognize that people of varying demographics aren't homogeneously distributed among zipcodes of similar economic bins. Part of this is easily explained by a simple fact: when people move, they like to move to where they have friends, family, or other connections.
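A small sketch of why uneven distribution makes zip code a proxy: if you guess the majority group in each zip code, you beat the guess you'd make from the overall population alone. The zip names, group labels, and population counts are entirely hypothetical and are not meant to reproduce any real figures.

```python
# zip_code -> {group: population}. All numbers are invented.
population = {
    "zip1": {"A": 900, "B": 100},
    "zip2": {"A": 200, "B": 800},
    "zip3": {"A": 850, "B": 150},
}


def base_rate_accuracy(pop):
    """Accuracy of always guessing the overall majority group."""
    totals = {}
    for groups in pop.values():
        for group, n in groups.items():
            totals[group] = totals.get(group, 0) + n
    return max(totals.values()) / sum(totals.values())


def per_zip_accuracy(pop):
    """Accuracy of guessing the majority group within each zip code."""
    correct = sum(max(groups.values()) for groups in pop.values())
    total = sum(sum(groups.values()) for groups in pop.values())
    return correct / total


print("ignoring zip code:", base_rate_accuracy(population))
print("using zip code:   ", per_zip_accuracy(population))
```

If the groups were spread evenly across zip codes, the two numbers would be the same; the gap between them is a rough measure of how much the zip code alone gives away.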