The Origin: A Data Ghost in the Pandemic
In 2020, as the COVID-19 pandemic swept through North Carolina, the official state briefings told a story that didn’t add up.
Latino communities were on the front lines as essential workers. Yet the reported death tolls showed them accounting for only 5% of deaths, despite representing 10.7% of the population.
On the surface, it looked like they were “doing well.”
But something felt off.
Looking Closer
As a healthcare professional and former medical interpreter, I looked more closely at the data. That’s when I saw it: a massive category labeled “Missing Ethnicity.”
These deaths were nearly three times the number of reported Latino deaths.
Because these individuals weren’t identified, they were excluded from the percentage calculations. This effect is known as denominator exclusion: records with missing data are left out of the totals used to calculate rates.
In public health reporting, this practice can produce misleading estimates unless it is explicitly addressed, because the reported percentages describe only the cases with a recorded ethnicity, not everyone who died.
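To make the arithmetic concrete, here is a minimal sketch of how dropping missing records from the denominator affects a reported share. The counts are hypothetical, chosen only to mirror the ratios described above (a 5% reported share, with missing-ethnicity deaths at three times the reported Latino deaths). They are not North Carolina's actual figures, and the function name `share_bounds` is my own.

```python
def share_bounds(latino_known, other_known, missing):
    """Return (reported, lower, upper) Latino share of deaths as fractions."""
    known_total = latino_known + other_known
    all_deaths = known_total + missing

    # Reported share: missing-ethnicity records are dropped from the denominator.
    reported = latino_known / known_total
    # Lower bound: assume none of the missing-ethnicity deaths were Latino.
    lower = latino_known / all_deaths
    # Upper bound: assume all of the missing-ethnicity deaths were Latino.
    upper = (latino_known + missing) / all_deaths
    return reported, lower, upper


# Hypothetical counts for illustration only (not real data).
reported, lower, upper = share_bounds(latino_known=100, other_known=1900, missing=300)
print(f"reported: {reported:.1%}, possible range: {lower:.1%} to {upper:.1%}")
# prints "reported: 5.0%, possible range: 4.3% to 17.4%"
```

Under these assumed counts, the single reported figure of 5% hides a plausible range that extends well above it. The true value depends on who the unidentified individuals were, which the data alone cannot say.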

The Problem
The data wasn’t just incomplete—it was misleading.
The systems, from hospital intake to public health reporting protocols, were failing to capture the reality of a vulnerable population.
What appeared to be a lower burden of mortality was, in part, a product of systematic missingness.
The Realization
If the Social Determinants of Health (SDOH) explain why people get sick, the Social Determinants of Data (SDOD) help explain why data becomes incomplete, distorted, or misleading.
The Evidence
What the data didn’t show
After the initial peak in Latino deaths, the number of deaths with missing ethnicity often exceeded the number of reported Latino deaths, especially during COVID-19 surges.

The Shift
The question was no longer:
Why is the data missing?
But:
What conditions produce missing data?
Birth of SDOD
This shift led to the development of the Social Determinants of Data (SDOD)—a framework for understanding how systemic, institutional, and social forces shape what becomes data, and what does not.