A Twitter thread from Alyasah “Ali” Sewell includes a starter kit for researchers who want to move beyond including a categorical variable for “race” (and by extension “ethnicity”) in social science research, particularly in the design of quantitative models. Including these variables can cause all sorts of problems in causal analysis, especially when the “race” variable conflates multiple factors, some of which could be mediators or colliders. We should do better, including (maybe especially?) in industrial data science and machine learning. There, we are still trying to figure out how to get beyond our fear of using protected class attributes in our models (mostly because we don’t want to get sued), despite the opportunity to use such research to become a more inclusive organization whose work not only delights customers but also combats systemic racism (and other types of bigotry).
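To make the collider concern a little more concrete, here is a minimal sketch on purely synthetic data. It is not taken from Sewell's thread or from any of the books below. The variable names (`x`, `y`, `c`), the noise level, and the selection threshold are all illustrative assumptions. Two factors are generated independently, and a third variable is built to depend on both of them (a collider). Restricting the data by the collider's value then makes the two independent factors look related.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20_000

# Two hypothetical factors, generated independently of each other.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [rng.gauss(0, 1) for _ in range(n)]

# A collider: a variable that both factors influence (plus some noise).
c = [xi + yi + rng.gauss(0, 0.5) for xi, yi in zip(x, y)]

# Correlation between x and y in the full synthetic sample.
r_all = pearson(x, y)

# Conditioning on the collider, here by keeping only rows with c > 1.0
# (an arbitrary, illustrative threshold).
selected = [(xi, yi) for xi, yi, ci in zip(x, y, c) if ci > 1.0]
r_selected = pearson([p[0] for p in selected], [p[1] for p in selected])

print(f"correlation in full sample:      {r_all:.3f}")
print(f"correlation after conditioning:  {r_selected:.3f}")
```

In the full sample the correlation should sit near zero, while in the selected subset it should be clearly negative, even though `x` and `y` were generated independently. A single categorical variable that bundles several factors together can hide this kind of structure, which is one reason to think carefully about what such a variable is actually standing in for.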
Here is the reading list for future reference.
Beyond “race” variable reading list
- Fatal Invention by Roberts (2011)
- Thicker than blood: How racial statistics lie by Zuberi (2001)
- White logic, white methods: Racism and methodology by Zuberi and Bonilla-Silva (2008)
- Rethinking race and ethnicity in research methods by Stanfield (2016)
Also take note of the researchers Dr. Sewell references in the thread, many of whom are on Twitter.
Roberts, D. 2011. Fatal Invention: How Science, Politics, and Big Business Re-Create Race in the Twenty-First Century. New Press.
Stanfield, J.H. 2016. Rethinking Race and Ethnicity in Research Methods. Taylor & Francis.
Zuberi, T. 2001. Thicker Than Blood: How Racial Statistics Lie. University of Minnesota Press.
Zuberi, T., and E. Bonilla-Silva. 2008. White Logic, White Methods: Racism and Methodology. Rowman & Littlefield Publishers.