How to make good decisions from bad data
Nov 21, 2020
People often make a categorical distinction between randomized clinical trial data and other forms of data. Under this view the only information that can ground medical decision making is a large, multicenter, randomized clinical trial, and other study designs can only prove correlation, not causation. People who hold this view treat clinical trials as determinative of causation. Without a clinical trial you can’t make a causal claim, and once you have one, you no longer need to think that hard about causation.
Taking vitamin D back from the racists
Oct 23, 2020
Black and brown people in northern countries have been disproportionately affected by Covid-19. In the US, Sweden, Canada, and the UK, racialized people have been more likely to contract the disease, more likely to have severe courses, and more likely to die from it. The explanation you usually get for this is that excess mortality is caused by systemic racism or social determinants of health. Under this explanation, there’s nothing that surprising about the high Covid mortality because it’s just another example of discriminatory health care policies.
Masks and Vitamin D
Jun 21, 2020
Imagine that someone offered you a free lottery ticket. You would have a small chance of winning a million dollars, but the ticket doesn’t cost anything. It would be silly to turn down this ticket because you thought your odds of winning were either too small or too unclear; the only reason we care about the odds of winning a game is so that we can determine if the expected value of winning is higher than the expected cost of playing.
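The lottery-ticket argument can be made concrete with a small expected-value calculation. This is a minimal sketch: the function names and the probabilities used below are hypothetical illustrations, not figures from the post.

```python
def expected_net_value(p_win: float, prize: float, cost: float) -> float:
    """Expected winnings minus the cost of playing."""
    return p_win * prize - cost


def worth_playing(p_win: float, prize: float, cost: float) -> bool:
    """A game is worth playing when its expected net value is positive."""
    return expected_net_value(p_win, prize, cost) > 0


# Hypothetical odds: a one-in-a-billion chance at a million dollars.
tiny_odds = 1e-9
prize = 1_000_000.0

# A free ticket is worth taking for any nonzero chance of winning.
free_ticket = worth_playing(tiny_odds, prize, cost=0.0)

# The same ticket at one dollar is not, because the cost outweighs the expected prize.
paid_ticket = worth_playing(tiny_odds, prize, cost=1.0)
```

When the cost is zero, the exact odds stop mattering: any positive chance of winning gives a positive expected value, which is why uncertainty about the odds is not a reason to refuse the ticket.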
Race and Covid-19 mortality
May 23, 2020
In a recent interview, Linda Villarosa outlines the three major causes that she and other public health researchers have identified for the huge racial gap in Covid mortality:
1) Proximity to the virus: Black people live and work in environments where the virus is difficult to escape. They are more likely to work in essential services where it is difficult to engage in social distancing, and they are more likely to live in inter-generational homes in densely populated areas.
Vitamin D and Covid: We don't need to wait for more data.
May 15, 2020
I recently posted this graph on Twitter which suggests that Covid-19 mortality is related to latitude:
There’s a particular type of person on the internet who sees a graph like this and reaches deep into their data science boot-camp memories to exclaim “Correlation doesn’t imply causation!” or “There’s no randomized clinical trial!” or even “There are differences in testing strategy!”
These responses are stupid not because they’re wrong but because they ignore how decision-making works.
May 14, 2020
Let’s say you are a market-oriented dictator who is worried about Covid-19 and wants to reduce the size and number of social gatherings. You understand that gatherings generate a lot of value for your population and so you want a mechanism to ban the least valuable gatherings while allowing the high value ones to continue. For example, you would probably want to reduce the number of work karaoke parties while allowing people to go to their best friend’s wedding.
Vitamin D and Covid-19
May 3, 2020
In a recent piece about the puzzling ways that Covid-19 has spread across the world, the New York Times explores a number of possible theories about why Covid-19 has affected some countries more grievously than others, including “demographics, culture, environment, and the speed of government responses.” I think Vitamin D status should probably be included in this conversation.
A tale of two countries: Canada and Australia have had pretty similar Covid-19 timelines.
Why I use R
Dec 30, 2019
They said the war was over… Over the last couple of years, prominent members of both the R and Python communities have tried to move past the language wars and support both R and Python workflows. This makes sense intellectually; after all, R and Python are not all that different in the scheme of things, and so we should let people use whichever language they find more productive. This conversation manifests very differently in the workplace, however.
Technical debt for data scientists
Apr 19, 2019
Technical debt is the process of avoiding work today by promising to do work tomorrow. A team might identify that there’s a small time window for a particular change to be implemented and the only way they can hit that window is to take shortcuts in the development process. They might soberly calculate that the benefits of getting something done now are worth the costs of fixing it later.
Testing machine learning models with testthat
May 1, 2018
Automated testing is a huge part of software development. Once a project reaches a certain level of complexity, the only way that it can be maintained is if it has a set of tests that identify the main functionality and allow you to verify that functionality is intact. Without tests, it’s difficult or impossible to identify where errors are occurring, and to fix those errors without causing further problems.