Africa has the world’s greatest genetic diversity, yet it’s missing from research: we’re filling the gap

Throughout history, most of the world’s genomic research has relied on DNA data from people of European ancestry.

A genome is the full DNA code, about three billion (three thousand million) bases, spread across all the chromosomes. Each person has two genomes: one from their mother and the other from their father.

Well-resourced environments favour European-based research, which has generated hundreds of thousands of whole human genomes with associated health data. Yet modern humans, our species, evolved on the African continent. African populations therefore contain the deepest branches of human genetic history and the greatest genetic diversity on the planet. Despite this, the continent remains strikingly underrepresented in global genomic databases.

The African continent is populated by people from over 2,000 ethnolinguistic groups, yet genetic data exist for fewer than a hundred groups. This is akin to having a GPS map of a city with only 5% of the streets marked and the rest left blank.

This bias has profoundly shaped modern medicine, from disease prediction tools to ancestry testing. And it’s why researchers increasingly recognise that studying African genomes has the potential to reveal insights and health-related biological pathways never observed before.

As a team of researchers, we were involved in identifying under-represented groups in nine African countries for human whole-genome sequencing. Our multidisciplinary team, part of the Assessing Genetic Diversity in Africa project (AGenDA), has worked out ethical ways to obtain, record and share genetic material and to add to global databases.

The AGenDA dataset alone is expected to uncover millions of previously unknown genetic variants, and analyses are underway. These discoveries will inform research into diseases that affect populations in Africa and worldwide. They include diabetes, heart disease, cancer and neurological or mental health conditions.

This is only a first step. Capturing the full scope of African genomic diversity will require hundreds of thousands of genomes. The project aims to bridge some of the most obvious gaps rather than fully map the continent’s diversity.

But expanding African genomic data is not only important for Africa. It will strengthen global biomedical science.

What it takes

Modern genomic science relies on large databases of DNA sequences to understand disease risk, ancestry and human evolution. These databases underpin a wide range of scientific and medical tools. They are used in medical research, disease prediction, drug development, ancestry testing and increasingly in artificial intelligence models that analyse health data.

A reference database is a library of whole genome sequences. When a population is absent from it, science simply cannot detect that population. Algorithms for genetic analysis work by comparing individuals to reference populations. In the absence of a specific reference population, the algorithms will assign the closest available match.
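The "closest available match" behaviour can be illustrated with a toy sketch. It assumes, purely for illustration, that each reference population is summarised by the frequencies of the same four variants and that similarity is plain Euclidean distance. Real ancestry tools use far richer statistical models, and every label and number here is invented.

```python
import math

# Hypothetical reference panel: population label -> frequencies of the
# same four variants. All labels and values are invented for illustration.
REFERENCE_PANEL = {
    "pop_A": [0.10, 0.80, 0.30, 0.55],
    "pop_B": [0.60, 0.20, 0.70, 0.15],
}

def closest_reference(sample, panel):
    """Return the panel label whose variant frequencies are nearest to the sample."""
    return min(panel, key=lambda label: math.dist(sample, panel[label]))

# A sample drawn from a population ("pop_C") that is missing from the panel.
sample_from_pop_c = [0.45, 0.35, 0.90, 0.05]

# With pop_C absent, the sample is forced onto the nearest label that exists.
assigned_without = closest_reference(sample_from_pop_c, REFERENCE_PANEL)

# Once a (hypothetical) pop_C reference is added, the same sample matches it.
extended_panel = {**REFERENCE_PANEL, "pop_C": [0.50, 0.30, 0.85, 0.10]}
assigned_with = closest_reference(sample_from_pop_c, extended_panel)

print(assigned_without, assigned_with)
```

The function always returns some label, even when none fits well, which is why gaps in the reference data show up as confident but misleading assignments rather than as an obvious failure.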

This problem becomes particularly visible in ancestry testing. This is a form of genetic testing often used to learn more about biological heritage. Because African reference data remain incomplete, people with African ancestry may receive vague or misleading results about their origins.

Without more African genomic data, the assignment of specific ancestry may be incorrect. Disease risk predictions may also be misleading. For example, it has been shown that standard doses of medications like warfarin (a blood thinner) or efavirenz (an HIV medication) could be ineffective or toxic for people who carry specific variants that are more common in African populations.

Prior knowledge of the distribution of such variants in a population could be key to deciding the suitability of a drug for patients from that population.
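One simple way to describe the distribution of a variant is its frequency in a population. The sketch below assumes genotype counts are available for a single variant and that each person carries two copies of the genome, as described earlier. The cohort names and counts are hypothetical and are not drawn from any study.

```python
def allele_frequency(n_no_copies, n_one_copy, n_two_copies):
    """Share of all genome copies in a cohort that carry the variant.

    Each person contributes two copies, so the denominator is twice
    the number of people.
    """
    total_copies = 2 * (n_no_copies + n_one_copy + n_two_copies)
    if total_copies == 0:
        raise ValueError("no genotypes supplied")
    return (n_one_copy + 2 * n_two_copies) / total_copies

# Hypothetical genotype counts for one variant in two cohorts of 100 people.
cohort_x = allele_frequency(n_no_copies=90, n_one_copy=9, n_two_copies=1)
cohort_y = allele_frequency(n_no_copies=60, n_one_copy=30, n_two_copies=10)

print(f"cohort X: {cohort_x:.3f}, cohort Y: {cohort_y:.3f}")
```

A variant that is rare in one cohort and common in another, as in these invented numbers, is the kind of difference that only becomes visible once both populations have been sequenced.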

Filling some of the gaps

The AGenDA project was designed to begin addressing some of the gaps in genome data and African representation. This project involved large multi-country scientific collaborations across the continent. It also required co-ordinating research across multiple ethics committees, regulatory frameworks and institutions. Scientists collaborated with research partners in Angola, the Democratic Republic of Congo, Kenya, Libya, Mauritius, Rwanda, Tunisia and Zimbabwe.

The aim was not simply to increase the number of African genomes in global databases. Instead, the team carefully selected populations to address major geographic and ethnolinguistic gaps in genomic data.

But generating large genomic databases requires careful community engagement and consent from participants to share their data. Biological samples for DNA extraction must be collected and the sequencing performed one base at a time.

We therefore built community engagement and culturally appropriate consent processes into the project from the beginning.

More than 1,000 whole genomes were sequenced from communities that had rarely been included in previous genetic studies. These included:

  • hunter-gatherer populations

  • Nilo-Saharan-speaking communities

  • Afro-Asiatic speakers

  • understudied Bantu-speaking populations

  • communities from north Africa and the Indian Ocean islands.

Selecting samples required careful consideration of what African diversity actually represents.

Genetic diversity does not map neatly onto modern national borders. Instead, researchers considered a range of additional factors. These included:

  • poorly represented geographic regions in genomic databases

  • major ancestral population histories

  • languages spoken and self-identified ethnic groups

  • recent patterns of migration.

In some cases, neighbouring communities may appear close due to geographic proximity but have distinct genetic histories that reflect population separations thousands of years ago.

Why studying African genomes benefits science everywhere

The genomes of African populations contain more genetic variation than those of populations on any other continent. This diversity provides a powerful resource for scientific discovery. When researchers study more diverse populations they are better able to achieve a number of things.

Firstly, they can identify new genetic variants.

Secondly, they can investigate evolutionary forces, like natural selection, that have shaped the genomes of people in different parts of the world.

And thirdly, they can pinpoint variants that influence health and disease.

More inclusive genomic datasets are also essential as genomics becomes integrated with artificial intelligence systems that analyse medical data and predict health outcomes. Future medical technologies could be biased to work best for whoever is represented in the data.

Ultimately, expanding African genomic representation will help ensure that the benefits of genomic medicine are shared more equitably. At the same time, it will improve the accuracy and depth of understanding in global genetic science.

The Conversation

Michele Ramsay is the South African Research Chair in Genomics and Bioinformatics of African populations. Funding for this work was from the National Institutes of Health (USA) and it was done in partnership with Illumina.

Ananyo Choudhury receives funding from the National Institutes of Health, USA, the South African Medical Research Council, and the Science for Africa Foundation.


