A new study from the NIH’s All of Us program is shaking up long-held assumptions by revealing that genetic ancestry rarely aligns with racial labels—and that the interplay between biology and society is far messier than we like to admit. Race might be a helpful lens for understanding inequality, but it’s a terrible shortcut for decoding our DNA.
“The two concepts, the social and biological understandings of diversity, have been doing this sort of dance around each other, and sometimes they’re in productive juxtaposition. But oftentimes they’re entangled, and it’s just a mess.”
- Jonathan Kahn, Professor of Law and Biology, Northeastern University
The use of race in research is increasingly controversial. As a social construct, it remains a valid variable in social science studies. As a genetic biomarker, its value has been increasingly questioned, even on seemingly simple matters such as whether race-based clinical algorithms improve the prediction of renal and pulmonary disease. It has become even more problematic in genetic studies. A new report, based on the NIH’s “All of Us” Research Program, may put to bed any belief that we can categorize genetics and physiology along racial lines.
“All of Us” is the longitudinal research program initiated in 2018 by the National Institutes of Health to advance precision medicine by collecting health data from over one million diverse participants across the United States. This new study, published in the American Journal of Human Genetics, is based on the “All of Us” biobank of genetic information on participants, the largest biobank in the US. Before jumping in, let’s get on the same page with some definitions.
- Race is self-reported, reflecting the “politically recognized definitions of race within the country used by the US Census.” It is a social construct if for no other reason than the categorization of race varies not only between countries but also among individuals.
- Ethnicity, a term seemingly closer to genetics, is also a social construct; in the US, we recognize two categories: Hispanic and non-Hispanic.
- Ancestry, based on genetic variations (alleles) found throughout our entire genome, is a genetic blueprint of our origins in human populations. It reflects our genetic inheritance, making it the most biologically relevant categorization.
The researchers examined over 230,000 unrelated whole genomes collected by “All of Us,” focusing on 2 million common genetic variants (SNPs) to facilitate computational efficiency. They applied Principal Component Analysis (PCA), a statistical technique that identifies major statistical patterns of genetic variation, termed principal components (PCs), to map out population structure and individual genetic ancestry.
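For readers who want to see the mechanics, here is a minimal sketch of PCA on a toy genotype matrix. It is not the study's pipeline: the six individuals, five SNPs, and every allele count are invented for illustration, the data are only mean-centered (real analyses typically also scale each SNP and rely on specialized software), and the function names are hypothetical. The point is simply that individuals with similar genotypes land near each other along the first principal component.

```python
import math

# Toy genotype matrix: rows are individuals, columns are SNPs, values are
# alternate-allele counts (0, 1, or 2). All numbers are made up.
genotypes = [
    [2, 2, 0, 0, 1],  # individuals 0-2: one hypothetical ancestry background
    [2, 1, 0, 0, 1],
    [1, 2, 0, 1, 1],
    [0, 0, 2, 2, 1],  # individuals 3-5: a second hypothetical background
    [0, 1, 2, 2, 1],
    [1, 0, 2, 1, 1],
]


def center_columns(matrix):
    """Subtract each SNP's mean allele count so only variation remains."""
    n = len(matrix)
    n_snps = len(matrix[0])
    means = [sum(row[j] for row in matrix) / n for j in range(n_snps)]
    return [[row[j] - means[j] for j in range(n_snps)] for row in matrix]


def first_principal_component(centered, iterations=200):
    """Power iteration for the leading eigenvector of X^T X (SNP loadings)."""
    n_snps = len(centered[0])
    v = [1.0] * n_snps  # fixed start vector keeps the example reproducible
    for _ in range(iterations):
        # Multiply by X, then by X^T, then renormalize to unit length.
        xv = [sum(x * w for x, w in zip(row, v)) for row in centered]
        v = [sum(centered[i][j] * xv[i] for i in range(len(centered)))
             for j in range(n_snps)]
        norm = math.sqrt(sum(w * w for w in v))
        v = [w / norm for w in v]
    return v


centered = center_columns(genotypes)
loadings = first_principal_component(centered)

# Each individual's PC1 score is the projection of their centered genotype
# onto the loading vector.
pc1_scores = [sum(x * w for x, w in zip(row, loadings)) for row in centered]

for i, score in enumerate(pc1_scores):
    print(f"individual {i}: PC1 = {score:+.2f}")
```

In this toy example the first three individuals fall on one side of PC1 and the last three on the other. The overall sign of a principal component is arbitrary, so only the relative positions matter.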
The researchers initially identified broad gradients of genetic variation rather than utilizing predetermined population-based clusters. By projecting their results onto global and other known reference panels, they could place participants in a worldwide context and assign “unknown” ancestry based on how their DNA aligned with established patterns.
Admixture analysis complemented PCA by digging deeper into the specific ancestral contributions within an individual’s genome, estimating the proportion of a person’s ancestry that comes from multiple historical source populations. Like a report from Ancestry.com, this technique quantifies our ancestral blend.
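As a rough illustration of what an admixture estimate does, the sketch below recovers ancestry proportions for one idealized individual from known allele frequencies in two source populations, using a simple expectation-maximization loop. Everything here is an assumption made for the example: the population names, the six allele frequencies, and the 75/25 individual are invented, and neither the study's actual software nor its ancestral clusters are reproduced.

```python
# Hypothetical alternate-allele frequencies at six SNPs in two source
# populations. Real analyses use far more SNPs and curated reference panels.
source_freqs = {
    "source_A": [0.9, 0.8, 0.1, 0.2, 0.7, 0.3],
    "source_B": [0.1, 0.2, 0.9, 0.7, 0.2, 0.8],
}


def estimate_admixture(dosages, freqs, iterations=500):
    """Estimate ancestry proportions by EM, holding source frequencies fixed.

    dosages: alternate-allele count (0 to 2) at each SNP for one individual.
    freqs:   alternate-allele frequency per SNP for each source population.
    """
    names = list(freqs)
    n_snps = len(dosages)
    q = {k: 1.0 / len(names) for k in names}  # start from equal proportions
    for _ in range(iterations):
        totals = {k: 0.0 for k in names}
        for j, g in enumerate(dosages):
            # Probability that a random allele copy is alternate or reference
            # under the current proportions.
            f_alt = sum(q[k] * freqs[k][j] for k in names)
            f_ref = 1.0 - f_alt
            for k in names:
                # E-step: expected number of this individual's allele copies
                # at SNP j that trace back to source k.
                totals[k] += g * q[k] * freqs[k][j] / f_alt
                totals[k] += (2 - g) * q[k] * (1 - freqs[k][j]) / f_ref
        # M-step: proportions are the share of all 2 * n_snps allele copies.
        q = {k: totals[k] / (2 * n_snps) for k in names}
    return q


# An idealized individual whose dosages equal the expected value for a
# 75% source_A / 25% source_B mix (real genotypes are whole numbers and noisy).
true_mix = {"source_A": 0.75, "source_B": 0.25}
dosages = [
    2 * sum(true_mix[k] * source_freqs[k][j] for k in true_mix)
    for j in range(6)
]

estimated = estimate_admixture(dosages, source_freqs)
for name, share in estimated.items():
    print(f"{name}: {share:.3f}")
```

Because the toy individual was built from exact expected dosages, the estimate returns the 75/25 split it started from. With real, noisy genotypes the proportions would carry uncertainty, which is one reason the analysis relies on so many SNPs.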
Among the findings:
- Human genetic variation doesn’t fit neatly into our racial and ethnic categories. There were five broad patterns, or principal components (PCs), identified; however, individuals do not fall into isolated, distinct genetic groups corresponding to their self-identified race and ethnicity. Ancestry is more nuanced and complex.
- We are indeed a global melting pot, home to a significant number of migrants from nearly every country and continent. The “All of Us” genetic diversity often exceeded global reference datasets, underscoring the genetic richness of our populations. Black or African American participants spanned a continuum between African and European ancestry. White participants exhibited dominant European ancestry, with detectable traces of South Asian, African, and Native American ancestry. Hispanic or Latino participants rarely identified with a specific racial category, and when they did, they occupied the entire range: African, Native American, and European.
- Admixture analysis, which identified 13 likely “global” ancestral clusters, revealed considerable variability in individual ancestral proportions within each self-identified race and ethnicity category. We are all mutts. [1]
- Those admixture proportions varied across US states, reflecting where we came from, voluntarily or involuntarily, and where we moved once we arrived, e.g., the Great Migration or the migrations brought on by the Dust Bowl. Southern Whites more often came from Southern Europe, while those in the Midwest more often came from Northern Europe. Hispanic or Latino participants showed more Native American ancestry in the Southwest and more African ancestry in the Northeast (e.g., New York).
Throwing the Baby Out with the Bathwater
The analysis revealed that associations between ancestry and biological traits were attenuated when socio-cultural or environmental factors were taken into account. Biological characteristics, such as height and BMI, are shaped not only by genetic factors but also by a complex web of socio-environmental factors. Ancestry contributes meaningfully: North European and West-Central African ancestry, for example, were linked to taller height, while Native American and South Asian ancestries were associated with shorter stature. Yet these patterns shifted when researchers factored in variables like income, education, ZIP code, and even country of birth.
Adding self-identified race and ethnicity to their models improved predictions of traits like BMI and height, not because these categories represent genetics but because they help capture environmental and social influences, such as diet, stress, and access to care, that standard models often miss. For instance, Native American ancestry was linked to higher BMI, while East African ancestry showed the opposite trend, underscoring how subcontinental groups can have distinct and even opposing biological patterns.
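The attenuation described above is what statisticians call confounding, and it can be shown with a toy regression. In the sketch below, a made-up trait depends weakly on an ancestry fraction and strongly on an environmental score that happens to track ancestry. None of these numbers come from the study; the variable names, the effect sizes (1 and 2), and the six data points are assumptions chosen so the arithmetic is easy to follow.

```python
def solve_linear_system(a, b):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        rest = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x


def ols(predictors, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data: the environment score rises with the ancestry fraction.
ancestry = [0.0, 0.0, 0.5, 0.5, 1.0, 1.0]
environment = [0.0, 1.0, 1.0, 2.0, 2.0, 3.0]

# By construction, the trait has a true ancestry effect of 1 and a true
# environment effect of 2 (arbitrary units, no noise).
trait = [1.0 * a + 2.0 * e for a, e in zip(ancestry, environment)]

unadjusted = ols([[a] for a in ancestry], trait)
adjusted = ols([[a, e] for a, e in zip(ancestry, environment)], trait)

print(f"ancestry effect, unadjusted:               {unadjusted[1]:.2f}")
print(f"ancestry effect, adjusted for environment: {adjusted[1]:.2f}")
```

Leaving the environment out makes ancestry look five times more influential than it was built to be; adding it back shrinks the estimate to its true value of 1. The study's models are far richer, but the logic of the adjustment is the same.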
Some takeaways
- Race and ethnicity, as used in surveys and the Census, reflect cultural and social identities, but don’t reliably capture our genetic ancestry.
- The All of Us dataset does not fully capture all global ancestries. Some Native American, Middle Eastern, South Asian, and African hunter-gatherer groups remain underrepresented, highlighting the continued need for inclusive global sampling in genetic research.
- Genetic ancestry varies considerably within these groups and across different US regions, underscoring the need to assess fine-scale ancestries to control for confounding and advance precision medicine.
- To understand how genes influence health, we must also consider where people live, how they identify, and the environments that shape their lives.
The study underscores the importance of treating continental ancestry not as a single, uniform category but as a mosaic of distinct subcontinental ancestries, reflecting the rich regional variation within Africa, the Americas, Asia, and Europe.
Race is a poor proxy for genetics, but in some instances, it serves as a better proxy for outcomes, especially when socioeconomic factors are the primary determinant. Separating genetic ancestry from social identity is neither simple nor always possible. Traits such as height and BMI are shaped by both inherited variants and the environments in which we live. Incorporating race and ethnicity into models isn’t a validation of race as biology; it’s a recognition that social categories encode real-world differences that affect health, even if they do so indirectly.
The researchers recommend that future research adjust association models to prioritize direct measurement of environmental factors over using race or ethnicity as proxies, reserving such proxies for situations where no better data are available and their predictive value is empirically supported. Unfortunately, that may require a more nuanced understanding of how biology and society co-evolve than we currently possess.
[1] Hispanic or Latino: ~50% European, ~31% Native American, ~13% African ancestry on average. Black or African American: ~83% African, ~14% European, with some showing majority-European ancestry. White: Mostly (~90%) European, with ~8% South Asian ancestry appearing in some. Middle Eastern or North African: ~66% European, ~26% South Asian ancestry. Asian: Primarily East (~68%) and South Asian (~25%) ancestry. Native Hawaiian or Pacific Islander: Highly admixed, with ~37% East Asian ancestry.
Source: Subcontinental Genetic Variation In The All Of Us Research Program: Implications For Biomedical Research. American Journal of Human Genetics. DOI: 10.1016/j.ajhg.2025.04.012
Dr. Charles Dinerstein, M.D., MBA, FACS is Director of Medicine at the American Council on Science and Health. He has over 25 years of experience as a vascular surgeon.
https://www.acsh.org/news/2025/06/11/race-social-construct-or-genetic-biomarker-49543