There was some buzz Thursday about a poll showing that 40% of white people don’t have any friends of a different race. Ipsos/Reuters include a spiffy “data explorer” where you can make graphs like the one above. It does not appear to provide an easy way to get at the actual wording of the question, which is kind of crucial, and thus renders most of the stories about it too vague to take all that seriously.

Of course, this is somewhat reminiscent of the gender bias story from a couple of weeks ago, where it was shown that single-gender physics departments are not a clear indication of sexism. The actual distribution there turns out to be, if anything, slightly better than you would expect from unbiased random selection from the existing faculty population. The single-gender departments are inevitable given the small proportion of women in the physics faculty ranks. This doesn’t mean there isn’t gender bias, just that the distribution can’t really be used to show it.

Something similar is, at least in principle, in play here. After all, “people of color” and “minorities” are used somewhat interchangeably in media reports for a reason: 72% of the US population identifies as white. That’s only a little smaller than the gender split in physics, so the same sort of argument could apply– if you took as your null hypothesis that people make an unbiased selection of a limited number of friends from the general population at random, you would inevitably end up with a bunch of white people having only white friends, because of the unequal population numbers. In fact, if you want to interpret this as saying anything about race relations, you really *need* to do that comparison, in the same way that the AIP study of single-gender departments is necessary for studying gender bias. Also, it’s an excuse to do a playing-with-numbers blog post about basic math.

Of course, this is where the vagueness of the reports becomes a problem– without the exact wording of the question, there’s no way to do a reasonable comparison. But since lack of adequate information doesn’t stop news organizations from running useless stories, it shouldn’t stop us from doing quasi-statistical analysis of the numbers.

So, what can we do with almost no information? Well, the headline number for this story is the fraction of white people with no friends of another race, and that’s a nice thing to work with, because it lets us use the most trivial result from probability and statistics: if you have a particular random event that occurs with some probability *p*, then the probability of that thing happening the same way *N* times in a row is just:

$latex P(0) = p^N $

In this case, *N* would be the number of randomly-chosen friends, but we don’t know what that is. We do, however, know *P(0)*— it’s 40% for white people– and we know *p*, which is the 72% of the population identifying as white. So, we can solve for *N*.

This may seem like a hard thing to do, given that *N* only appears in the exponent, but we can use an old-school trick: take the logarithm of both sides. The logarithm of a number raised to a power is the log of that number multiplied by the power. In this case, we get:

$latex \ln (P(0)) = N \ln (p) $

which we can trivially solve for N, provided we can take logs (I did natural logs because I’m a physicist, and we like *e*. You could use base-10 logs if you’re an engineer, or base-2 if you’re a computer scientist, or base-n if you’re a masochist. It’s all the same in the end). And my cheap scientific calculator has a log button, so we’re good to go.

Plugging in the numbers, we find that:

$latex N = \frac{\ln 0.40}{\ln 0.72} \approx 2.8 $
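If you'd rather not trust a cheap calculator, the same arithmetic can be checked in a few lines of Python. This is only a sketch of the calculation above; the function name `friends_for_zero_fraction` is made up for illustration, and the inputs are the 40% and 72% figures quoted in the text.

```python
import math

def friends_for_zero_fraction(p_zero, p_same):
    """Solve p_zero = p_same ** n for n by taking logs of both sides.

    p_zero: fraction of people with no friends of another race (P(0)).
    p_same: probability that one randomly chosen friend is the same race (p).
    """
    return math.log(p_zero) / math.log(p_same)

# Figures from the post: 40% of white respondents, 72% of the population.
n = friends_for_zero_fraction(0.40, 0.72)
print(round(n, 1))  # 2.8
```

Any base of logarithm gives the same ratio, which is why the choice of `math.log` (natural log) here is just a matter of taste.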

So, you know, as long as white people only have 2.8 friends each, this result would be entirely consistent with random selection from the general population. Of course, given that the next most common answer for white people is “five or more” friends of different races (20% of whites), that’s probably not a great measure of the actual number of friends people have. But it makes for an eye-catching post title, so, by normal media standards, I can run with that. Right?

(This, by the way, is why the wording of the actual poll is critical. The survey seems to ask about “close friends,” and 2.8 would not be wildly out of line with some other studies about social interactions. But then, the numbers they report are not really consistent with 2.8, or with the other study. So they’re almost certainly using a different definition of “close friends,” but without knowing the definition, there’s no way to make sense of this.)

Somewhat more seriously, we could turn this around to ask a slightly different question, namely, what population distribution would you need to have to get this result from an unbiased random selection? Because, after all, the choosing of friends is a local interaction, and the population is not uniformly distributed. The national average of the population is 72% white, but that’s going to vary a lot depending on location.

If you pick a number of “close friends,” you can take the *N*th root of the probability, and get a “local” value of *p* that you would need to have a random selection end up with that probability of having no friends of other races. That’s also a pretty trivial calculation to do, and if *N* is 10 (chosen more or less at random because it’s a nice round number greater than the “five or more” option that’s the other most common result), white people would need to be randomly drawing their friends from a population that’s 91% white. The same process can be used with the other numbers, as well– the overall “minorities” figure from the survey is about 25% reporting friends all of the same race, which would equate to drawing ten friends at random from a local population that’s 87% of whatever race. Even if you take the lowest single-race friend figure, the 17% value for Hispanics, you end up with an implied local population that’s 84% Hispanic.
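The root-taking step is just as easy to reproduce. The snippet below is a rough sketch, with a hypothetical helper name (`implied_local_fraction`) and the assumed value of ten friends from the paragraph above; the three probabilities are the survey figures quoted in the text.

```python
def implied_local_fraction(p_zero, n_friends):
    """Invert p_zero = p ** n_friends to get the local same-race fraction p."""
    return p_zero ** (1.0 / n_friends)

# Survey figures quoted in the post; N = 10 is an arbitrary round number.
survey = [("white", 0.40), ("minorities overall", 0.25), ("Hispanic", 0.17)]
local = {label: implied_local_fraction(p_zero, 10) for label, p_zero in survey}

for label, p in local.items():
    print(f"{label}: {round(100 * p)}%")
# white: 91%
# minorities overall: 87%
# Hispanic: 84%
```

Changing the `10` to some other guess for the number of close friends shows how sensitive the implied local population is to that assumption.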

(At first glance, those numbers may all seem too similar, given that the headline probabilities from the survey are so different. That’s because we’re looking at the tenth root of all those probabilities, though, and ten’s a reasonably big power. The Nth root of any number will converge toward 1 as N gets very large, something you can crudely demonstrate by punching a random number into your calculator and then hitting the square-root key repeatedly.)
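The square-root-key experiment can be mimicked in code as well. This is a minimal sketch, starting from the 17% figure only because it is the smallest number in play; ten presses of the key amount to taking the 1024th root.

```python
import math

x = 0.17            # starting value, the lowest survey figure above
history = [x]
for _ in range(10):  # ten presses of the square-root key
    x = math.sqrt(x)
    history.append(x)

# After k presses the value is 0.17 ** (1 / 2**k), which creeps up toward 1.
print(history[-1])
```

Each entry in `history` is closer to 1 than the one before it, which is the crude demonstration the parenthetical describes.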

Now, are those numbers more plausible than the “white people have 2.8 friends” figure? Pretty much, yes. If you go to something like the New York Times’s neighborhood explorer, it’s really easy to find whole counties that are better than 90% white, and not that hard to locate urban neighborhoods that are 80+% black or Hispanic. There are vast swathes of the country where the most likely explanation for people not having friends of a different race is simply that there aren’t a significant number of *people* of other races around.

Is that a problem? Well, I suppose you could concoct a scenario where it wasn’t, where that sort of population distribution came about for innocent reasons. In which case, it wouldn’t really be all that big a deal.

That, sadly, is not the nation in which we live. The distribution of population that we actually see is not the result of anything innocent, but is a legacy of a long and ugly history of racial discrimination pushing minority groups into less-desirable neighborhoods. Which means that the geographic distribution of race is also strongly correlated with wealth and political power, a correlation caused by very bad history. The numbers reported in this poll are problematic not so much because it’s a bad thing for people to not have friends of another race, but because they’re a proxy measurement of a deeper problem of social and economic stratification in American society. The effect is further exacerbated by social class, as well– wealthy white people are not likely to spend significant socialization time with poor non-whites, even those in the same neighborhood– making it all too easy to have local neighborhoods that are effectively over 90% a single race.

And if you want a real take-home message from this survey, that’s probably it. The fact that 40% of white people don’t have any friends of a different race does not necessarily mean that all those white people are racist at the level of choosing who to be friends with. This is not to deny the existence of real racists and real racism, but as with the gender bias study, the mere fact that there are lots of people with monoracial friend groups does not unambiguously indicate that racist choices are being made. (Though a bunch of the headlines and Twitter links about the survey tend to create that impression.)

The real story is those population statistics, and the historical origin of those distributions. This survey may not point to racism at the level of choosing friends, but even under the most generous interpretation of personal choices, there’s racism at the level of where people live, due to the long-lasting effects of discrimination in housing, etc. But you have to think a little more carefully about what’s going on to get there from the numbers that lead these stories.

(Of course, that’s not a particularly new story. Which is probably the best point in favor of stories about this survey: they’re a roundabout way of bringing an old problem up in a different context. Also, it’s an excuse to do a little math, and that’s never a bad thing…)

(If you wanted to get at whether this survey actually says anything about *attitudes*, you could look beyond the zero-friend figures, and try to see how the distribution of 1, 2, 3, and 4-friend numbers compares to random chance, and if you end up with local population numbers that are similar. But those numbers are all significantly smaller, meaning that the answers would be even more dubious than the above, and it would be more effort than would be worth it for a blog post.)
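For anyone who does want to try that comparison, the "random chance" side of it is a binomial distribution: with *N* friends each drawn independently, the chance of exactly *k* friends of another race is C(*N*, *k*) (1 − *p*)^*k* *p*^(*N* − *k*). The sketch below assumes the same made-up *N* = 10 and the local *p* implied by the 40% figure; neither number comes from the survey itself, and the survey's own 1-to-4-friend shares are not reproduced here.

```python
from math import comb

def prob_k_other_race(k, n_friends, p_same):
    """Binomial probability of exactly k other-race friends out of n_friends."""
    q = 1.0 - p_same  # chance a single random friend is of another race
    return comb(n_friends, k) * q ** k * p_same ** (n_friends - k)

n_friends = 10                     # assumed, as in the post
p_same = 0.40 ** (1 / n_friends)   # local fraction implied by the 40% figure

dist = [prob_k_other_race(k, n_friends, p_same) for k in range(n_friends + 1)]
five_or_more = sum(dist[5:])       # the "five or more" bucket under this model

for k in range(5):
    print(k, round(dist[k], 3))
print("5+", round(five_or_more, 4))
```

Setting the printed values next to the survey's reported shares for each answer would be the actual test, though as noted, small shares make any such comparison shaky.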

“it’s really easy to find whole counties that are better than 90% white”

There are several states that are more than 90% white. I live in one of them. And often, the minorities that do live in such states will be clustered in certain regions, for a variety of reasons both good (Asian populations in and around this state’s university towns are significantly higher than most other towns, in part because many of them are students or employees of Local U.) and bad (the historic tendency in this country to force our First Peoples into hellholes like South Dakota’s Pine Ridge Reservation). That leaves many areas that are lily-white. A relative who lives about an hour away married somebody of a different race; her husband and children comprise a substantial fraction (at least 10%, and possibly as much as 100%) of that town’s nonwhite population.

An interesting article, and it begs a study into the general question:

“Do published statistical studies of racism in society indicate racism on the part of study authors?”

Or do the study on studies of sexism.

Or do experimental studies instead of statistical studies.

Perhaps go further and study the source of the -ism. Is it funding agencies, publish-ability, or what?

Probably a fairly easy subject to gather data on, since the subject published data. And a pretty topical subject.

“white people” is a racist term. The new terminology for them is caucasian American.

I am an Anglo-Saxon American. you racist people.