Case Study

If you are invisible in data, you are invisible in policy.

The Case Study

13.6% of NC COVID-19 death records were missing ethnicity in 2020

The map shows, for each county, the percentage of total deaths with missing ethnicity data. Darker shading indicates a higher percentage of missing data.


In 2020, as North Carolina faced a global health crisis, a significant portion of the population was quietly erased from the narrative. My research found that 13.6% of COVID-19 death records were missing ethnicity data, creating a “data ghost” that distorted public health understanding and response. This case study explores how systemic failures in reporting led to a gross underrepresentation of the Latino community in official statistics. Because records with missing ethnicity were excluded from percentage calculations, this “missingness” did more than reduce data quality: it actively distorted the reported rates.
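The distortion described above can be illustrated with a small arithmetic sketch. The counts below are hypothetical and chosen only to show the mechanism; apart from the 13.6% missingness rate used to size the missing group, they are not figures from the study. The function names are likewise illustrative. The sketch assumes that some Latino decedents sit among the records with missing ethnicity, so dropping those records lowers the Latino share that gets reported.

```python
def share_excluding_missing(group_recorded: int, total_records: int, missing_records: int) -> float:
    """Group share when records with missing ethnicity are dropped from the denominator."""
    known = total_records - missing_records
    if known <= 0:
        raise ValueError("no records with known ethnicity")
    return group_recorded / known


def share_if_complete(group_recorded: int, group_among_missing: int, total_records: int) -> float:
    """Group share if every record, including the currently missing ones, were classified."""
    return (group_recorded + group_among_missing) / total_records


# Hypothetical counts for illustration only; NOT the study's data.
TOTAL_DEATHS = 1000
MISSING_ETHNICITY = 136      # 13.6% of records, mirroring the headline missingness rate
LATINO_RECORDED = 50         # hypothetical: records explicitly marked Latino
LATINO_AMONG_MISSING = 40    # hypothetical: Latino decedents hidden among the missing records

reported = share_excluding_missing(LATINO_RECORDED, TOTAL_DEATHS, MISSING_ETHNICITY)
complete = share_if_complete(LATINO_RECORDED, LATINO_AMONG_MISSING, TOTAL_DEATHS)

print(f"Share reported after dropping missing records: {reported:.1%}")
print(f"Share if all records were classified:          {complete:.1%}")
```

With these made-up inputs, the share computed after dropping incomplete records comes out lower than the share that complete records would show. How large the gap is in practice depends on how missingness is distributed across groups.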

Key Findings

Statistical Invisibility: Although Latinos made up 10.7% of the NC population, missing data reduced their reported death rate to 5%, masking the true impact of the pandemic.

Systemic Failure: Missing ethnicity data was not random—it reflected systemic gaps, including limited training for death certifiers and inadequate technological infrastructure.

Policy Consequences: Because the community appeared to be “doing well” in the data, resources were often misallocated away from those most affected.

Together, these findings demonstrate how structural conditions—not just individual errors—produce data gaps, illustrating the core premise of the Social Determinants of Data.

Read the Full Study

Dying to be Counted: The Social Determinants of Data—A Critical Analysis of the Quality of the 2020 North Carolina Latino COVID-19 Mortality Data