Race and Covid-19


Gordon Shotwell


May 23, 2020

In a recent interview, Linda Villarosa outlines the three major causes that she and other public health researchers have identified as causes for the huge racial gap in Covid mortality:

1) Proximity to the virus

Black people live and work in environments where the virus is difficult to escape. They are more likely to work in essential services where it is difficult to engage in social distancing, and they are more likely to live in inter-generational homes in densely populated areas. All of this leads to more contact with the virus; more contact leads to more infection, and more infection leads to more death.

2) Racial bias in healthcare

Black people receive worse healthcare than white people. Black patients are less likely to be believed by physicians, essential interventions are more likely to be delayed, and resources tend to be allocated to white patients. Naturally, if you get worse healthcare because of your skin colour, your outcomes are going to be worse.

3) Pre-existing conditions

For a variety of reasons, Black Americans have higher rates of conditions that make Covid-19 more dangerous. Rates of diabetes, hypertension, and asthma are all higher in Black communities, which leads to higher Covid-19 mortality.

It’s helpful to draw this model as a graph to show the causal relationship:

What bothers me about this this model is that it leads to a pretty fatalistic perspective on Covid mortality. Most of the damage from the Covid pandemic is going to take place over the next 1-2 years, and it’s hard to imagine any of these systemic factors changing significantly during that time period. We’re probably not going to make major changes to our housing or economy such that Black people are not disproportionately proximate to the virus, and it’s unlikely that rates of pre-existing conditions can be modified in a year. It’s possible that bias in healthcare can be remedied, but I’m unaware of any project that has improved unconscious racial bias at such a large scale in such a short period of time. The result is that if you think that this is a complete model of excess Covid mortality, it seems likely that excess mortality is already baked into the system, and while tragic it’s not treatable. Even if these are major causes, they’re not causes that we can realistically modify over the next two years.

I think this is pretty dangerous because it leads researchers to effectively triage Black communities. If we’re certain that these are the only factors, and we’re also fairly certain that they can’t be changed within the necessary time span, what’s the point of studying the matter? Wouldn’t research energy be better spent reducing the overall incidence of the virus, rather than looking at relative differences between groups?

The fact is that we don’t know that much about Covid-19, so it’s possible that this isn’t a complete causal model and there are things that we can change to reduce excess mortality among people of colour. If we have too much certainty that this excess mortality is socially determined,
then researchers won’t invest time or money trying to figure out things that can be treated.

The vitamin D model

I’ve been very interested in vitamin D deficiency as a potential cause of Covid-19 mortality because it’s one of the best treatment candidates for widespread use. Unlike other potential therapies, we already have a lot of vitamin D clinical research data and we know that it’s safe to consume prophylactically. Vitamin D would also go a long way to explain why people of colour are dying at such high rates in the northern hemisphere where sunlight is currently scarce, but not in equatorial or southern parts of the world where sunlight is currently abundant. While skin colour is not the only factor that changes serum vitamin D, and race is a social category that’s only loosely related to skin colour, it is the case that in most northern latitudes people with darker skin (tend to have lower vitamin D)[https://nutrition.bmj.com/content/early/2020/05/20/bmjnph-2020-000096#ref-13] levels in the winter.

It’s important to be clear about the causal model I’m proposing. I am not arguing for a model like this one where vitamin D is the only relevant factor:

Instead, the vitamin D model adds vitamin D to all the other stuff that causes excess mortality. It might look like this:

What does the data say?

Of course none of this matters if it turns out that vitamin D is not a factor. So far there have been two main strategies that point in the direction that vitamin D, or some other unidentified factor which varies by latitude, is involved in Covid-19 mortality.

Controlling for social determinants of health

One of the main strategy statisticians use to identify whether a factor is important is to control for confounding factors. For example imagine if you had a dataset that showed a significant relationship between maternal weight and a child’s birth weight. If you had a suspicion that gestational time had some effect on both birth weight and maternal weight, you could include the length of the pregnancy as a term in the model. If maternal weight is just a spurious correlation, it won’t be a significant contributor to the model. The basic idea is that if a parameter is an important feature of the model, then either there is a real correlation between the feature and the outcome variable, or we’ve missed some other important factor that would explain away the relationship.

I think we should assume at the start that the virus itself is not making racial distinctions. Genetic factors, like sickle cells or blood type can change how a virus attacks a person, but there’s probably no common genetic pattern that unites all black and brown people in Northern countries. As a result, it’s hard to imagine a mechanism whereby the virus itself is picking out people of colour. Whatever effect race has on mortality is probably mediated by something like medical bias or proximity to the virus, and so when a study says that race is a significant predictor after controlling for x, y, and z, that means that we’re probably missing an explanatory feature.

There are two main studies that have attempted to control for social determinants of health when explaining Covid-19 outcomes:

First, a study out of the UK looked at NHS records and Covid-19 mortality. They controlled for preexisting conditions as well as a deprivation index that combines income, employment, premature death, education, crime, barriers to housing, and living environment. What they found is that after controlling for all of these factors, Black, Asian, and mixed race people had about twice the Covid mortality risk as white people.

This study confirms two out of the three causal pieces of Villarosa’s model of Covid mortality: proximity to the virus and pre-existing conditions. The deprivation index, which captures at least some of the socioeconomic factors that lead people to be more proximate to the virus, was associated with increased mortality, and pre-existing conditions were also found to lead to increased mortality. But since race still provides a lot of information, we know that there’s a missing factor.

The second study is a study of US veterans, which found that “Black and Hispanic individuals are experiencing an excess burden of Covid-19 not entirely explained by underlying medical conditions or where they live or receive care.” The main outcome of this study was positive Covid diagnosis, and it controlled for age, pre-existing conditions, and whether someone lived in an urban or rural setting. They found that Black veterans were more likely to be tested than white veterans, but were about twice as likely to test positive.

This study also controls for two pieces of Villarosa’s model: pre-existing conditions and healthcare bias. It directly controls for pre-existing conditions, but there also is a kind of implicit control on medical bias because the main outcome was a positive test. Logically, healthcare bias is more important the more treatment decisions a caregiver has to make. Giving birth is a great example: a doctor needs to decide whether to admit someone who might have a dangerous pregnancy, whether to induce, how to monitor fetal heart rate, and at what point to intervene with an emergency C-section. Bias creeps in at all of these points (and more); doctors might send a Black woman home when they would admit a white woman, monitor fetal heart rate less carefully, or intervene in dangerous labour a few minutes later. Each of these biased decisions endangers mother and child, and it’s a big part of why there’s a big racial gap in infant mortality.

By contrast, Covid diagnosis really only involves the one decision of whether to test or not, and we can assume that racial bias would tend to lead doctors to undertest Black people. A doctor might do a Covid-19 test on a white patient with marginal symptoms, while they may send a Black patient with the same symptoms home without a test. So bias in this case would reinforce, rather than undermine, the finding that Black and Hispanic people are at a higher risk of Covid-19 diagnosis, because the more bias there is, the fewer tests they would receive.

Neither of the studies above perfectly controls for the features in Villarosa’s model. It’s possible that the missing piece in the NHS model was biased medical care and that the missing piece in the VA model was proximity to the virus, but it should at least make you suspicious that there could be an important factor that isn’t being accounted for in the standard way of thinking about racial gaps in health outcomes.

Mendelian Randomization

A recent study provides a strong argument that the missing factor is vitamin D. The study uses a technique called mendelian randomisation, which requires a bit of explanation: In the ideal world, we would answer the question of whether vitamin D deficiency increases your risk of dying of Covid-19 by controlling for vitamin D in the model, but we don’t have information about everyone’s vitamin D levels. Mendelian randomization is the idea that if you have information about something that you know is related to the thing you wish you had data on, then you can include that as a proxy. In this case we know that people with darker skin tend to have lower vitamin D levels than people with lighter skin, and that that probably gets worse as you move north. The intuition here is that the skin’s ability to synthesize vitamin D is less important in Florida than Wisconsin because intense UVB radiation is abundant for most of the year. So this study uses race + latitude as a proxy for vitamin D levels.

The rules of mendelian randomization are that the randomizing factor can’t be directly associated with the outcome, and it can’t be associated with other explanatory factors.

So if you think the causal graph looks like this, mendelian randomization is appropriate:

The BMJ study found that excess mortality went up as you went north. Places like Alabama, which we typically think of as having bad racial health outcomes, had lower excess Covid mortality than places like Wisconsin, which we generally assume have at least somewhat better racial health outcomes. This is a pretty good argument for vitamin D’s involvement because it’s hard to find a story where latitude modifies other inputs into the Covid mortality. This isn’t to say that there’s no such story, but the most obvious one is that vitamin D deficiency increases risk of Covid-19 complication. This is the main graphic from the paper:

Does latitude modify other social determinants of health?

Mendelian randomization is not appropriate if the proxy also acts on your other predictor variables. For instance, you might think that because there are fewer Black people in Wisconsin than in Georgia, there are also fewer Black doctors, and so implicit racial bias might be worse. If that were the case, latitude wouldn’t give you much information about vitamin D because it could also be acting on the other social determinants of health, which we already know are important.

One quick check we can do to see if this graph makes sense is to look at infant mortality data by latitude. If excess infant mortality were higher in northern states than in southern ones, that would suggest that latitude acted on social determinants of health, because things like medical bias are likely affecting black mothers in the same way they are affecting black Covid patients. This data from the CDC shows that children of Black or African-American mothers die at a rate 2-3 times higher than the children of white mothers, and this rate doesn’t really change by latitude.

Does latitude act directly on mortality?

The other way that mendelian randomization can be invalid is if the proxy acts directly on the outcome variable. In this case, that would mean that latitude directly increased excess Covid mortality. At first glance it’s hard to think of how this would happen, but if you add testing to the model it becomes a bit clearer. If southern states have worse testing, and in particular are less likely to test Black people, Covid deaths would be misclassified as some other kind of mortality. The result would be that excess mortality would look lower, because Covid deaths among Black people were being under-counted.

I don’t have access to any data that shows testing by race and state, but we can look at the overall testing rate to see if that varies by latitude. The intuition here is that in places with abundant testing, racial bias would be less of a factor than in the places with scarce testing. Again, there doesn’t look to be much of a relationship here:


To reiterate, this argument is not to dispute the factors that Villarosa proposes. None of the papers about vitamin D and Covid challenge the idea that healthcare bias, pre-existing conditions, and proximity to the virus are major causes of excess mortality, and in fact all of the data supports that conclusion. What this does challenge is the idea that that’s the whole story. It seems like excess Covid mortality among people with darker skin is not completely explained by socioeconomics, location, or pre-existing medical condition. And if that’s true, it’s good news because it means that there may be something we can do about it. Vitamin D is a good candidate for that missing piece because it helps to explain why latitude would increase the racial gap in Covid mortality.