Science in Society Archive

Celebrating the Uses of Human Genome Diversity

& Dissecting the Controversies

Human genome diversity has been successfully used to chart the fascinating prehistory of human evolution, but controversies continue over the commercial exploitation of human cells and genes and over the lack of honesty and respect shown to participants by scientists. Dr. Mae-Wan Ho

Charting the prehistory of human evolution

What are the most important contributions made in ten years of the human genome sequence?

My personal nominations would certainly include mapping human genetic diversity to chart the prehistory of human evolution, a subject that has fascinated me since I taught myself population genetics as a post-doctoral fellow seeking my first teaching job.

The research team at Stanford University, California, in the United States, led by human population geneticists Luigi Luca Cavalli-Sforza, Richard Myers and Marcus Feldman, was able to use human genome diversity data to work out the ancestry of individuals and human populations, providing genetic evidence that the modern human species has evolved by successive migrations out of sub-Saharan Africa.

Individual ancestries segregated into continental population groups

The researchers carried out a study of 650 000 common single nucleotide polymorphisms (SNPs) in the genomes of 938 unrelated individuals from 51 populations of the Human Genome Diversity Panel [1]. SNPs are common sites in the genome identified by single nucleotide changes. The populations included in the study were from sub-Saharan Africa, North Africa, Europe, the Middle East, South/Central Asia, East Asia, Oceania, and the Americas.

In analyzing the data, the researchers first worked out the genetic ancestry of each individual without using his/her population identity. Each person’s genome was considered as having originated from K ancestral populations, the contributions of which are described by K coefficients that sum to 1 for each individual. Figure 1 shows the results for K = 7, the lowest number that gives the best segregation into distinct populations.

Figure 1 - Human ancestries

At K = 5, the 938 individuals segregated into five continental groups similar to those reported in a microsatellite study of the same panel. (Microsatellites are sequences of simple di- or tri-nucleotide repeats of varying lengths distributed widely throughout the genome.) At K = 6, the new component accounts for a major portion of ancestry for individuals from South/Central Asia, separating this region from the Middle East and Europe. This result differs from the microsatellite study, which failed to separate the group. At K = 7, the new component occurs at highest proportions in the Middle Eastern populations, separating them from European populations.

In many populations, ancestry is derived predominantly from one of the inferred components, but in others, especially those in the Middle East and South/Central Asia, there are multiple sources of ancestry. For example, Palestinians, Druze, and Bedouins have contributions from the Middle East, Europe and South/Central Asia. Burusho, Pathan, and Sindhi have an East Asian contribution. Hazara and Uygur share a similar profile of combined South/Central Asian, East Asian, and European ancestry.
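The ancestry model described above can be sketched in a few lines of Python. This is only an illustration of the idea that each individual is described by K coefficients summing to 1, not the researchers' inference method. The function names, the 0.8 cut-off for "predominant" ancestry and the example numbers are all assumptions made up for the sketch; none of them come from the study.

```python
from typing import Dict, Optional


def normalise_ancestry(raw: Dict[str, float]) -> Dict[str, float]:
    """Scale non-negative component weights so that they sum to 1."""
    if any(value < 0 for value in raw.values()):
        raise ValueError("ancestry weights must be non-negative")
    total = sum(raw.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return {name: value / total for name, value in raw.items()}


def predominant_component(coeffs: Dict[str, float],
                          threshold: float = 0.8) -> Optional[str]:
    """Return the largest component if it reaches the (assumed) threshold,
    otherwise None to signal multiple sources of ancestry."""
    name, share = max(coeffs.items(), key=lambda item: item[1])
    return name if share >= threshold else None


# Invented numbers for illustration only; they are NOT results from the study.
single_source = normalise_ancestry(
    {"East Asia": 95, "Europe": 3, "South/Central Asia": 2})
mixed = normalise_ancestry(
    {"Middle East": 50, "Europe": 30, "South/Central Asia": 20})

print(predominant_component(single_source))
print(predominant_component(mixed))
```

An individual whose largest coefficient falls below the chosen cut-off is reported as having no single predominant component, which corresponds to the "multiple sources of ancestry" case in the text.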

East Asia appears homogeneous, but finer substructure can be detected when individual regions are analysed separately. For example, two components separate the 16 East Asian populations and correspond to a north-south genetic gradient. Han Chinese can be divided into a southern and a northern group.

Mixed ancestries inferred from genetic data can often be interpreted as arising from recent admixture among multiple founder populations. But in the Stanford analysis, the estimated mixed ancestry can be due either to recent admixture or to shared ancestry before the divergence of two populations. For example, the European and Asian ancestries seen in Uygur and Hazara populations are likely due to relatively recent admixture, whereas the inferred Native American ancestry in Yakuts and Russians probably reflects shared ancestry before the predecessors of the Native Americans crossed the Bering Strait. The Middle Eastern populations may have experienced both continuous gene flow and shared ancestry with the rest of Eurasia.

Individuals belonging to the same recognized population almost always show similar ancestry proportions. It is therefore meaningful to evaluate the genetic relationships among populations and to construct a phylogenetic tree by the maximum likelihood method, using chimpanzee alleles as the outgroup (for comparison), the chimpanzee being the closest simian relative for which data exist. The resulting tree is shown in Figure 2.

Figure 2 - Human phylogenies

The sub-Saharan African populations are located nearest to the root of the tree, outward from which are branches that correspond, sequentially, to populations from North Africa, the Middle East, Europe, South/Central Asia, Oceania, the Americas and East Asia. The branching pattern largely agrees with the approximate order of human expansion and supports the “out of Africa” model of human origins.

These results were confirmed by another statistical technique for creating clusters, principal component analysis.

Further confirmation of the “out of Africa” model of human origins

SNP haplotype heterozygosity (the proportion of SNP sites that differ between the two paired chromosomes of an individual), another measure of genetic diversity, was found to be highest in sub-Saharan Africa and to decrease steadily with distance from this region. The mean heterozygosity across autosomal (non-sex chromosome) haplotypes is negatively correlated with distance from Addis Ababa, Ethiopia, with a correlation coefficient r of –0.91 and a slope of –1.1 × 10⁻⁵ per km. This trend is consistent with a serial founder effect, a scenario in which population expansion follows successive migration of a small number of individuals out of the previous location, starting from a single origin in sub-Saharan Africa. A similar trend was found for X-chromosome haplotype heterozygosity, and for microsatellite heterozygosity reported previously.
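To give a feel for what the reported slope means, the short Python sketch below multiplies it by a distance and also shows how a correlation coefficient r is computed. Only the slope of -1.1 × 10⁻⁵ per km is taken from the text; the function names, the 10 000 km example and the three data points are hypothetical, chosen to lie exactly on a straight line, and are not data from the study.

```python
import math
from typing import Sequence

SLOPE_PER_KM = -1.1e-5  # slope quoted in the text (change in heterozygosity per km)


def heterozygosity_change(distance_km: float,
                          slope: float = SLOPE_PER_KM) -> float:
    """Change in mean heterozygosity implied by a linear trend over a distance."""
    return slope * distance_km


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical distance: 10 000 km from the assumed origin.
print(heterozygosity_change(10_000))

# Invented points lying exactly on a line with the quoted slope,
# so r is -1 here; real data scatter around the line (r = -0.91 in the text).
distances_km = [0.0, 5_000.0, 10_000.0]
heterozygosity = [0.30, 0.245, 0.19]
print(pearson_r(distances_km, heterozygosity))
```

Under this linear trend, a population 10 000 km from the origin would be expected to show a mean heterozygosity about 0.11 lower than one at the origin.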

By genotyping two chimpanzee samples, the researchers were able to define the putative ancestral allele (form) for some 95.5 percent of the SNPs in the 650 000-SNP panel. The distribution of these ancestral allele frequencies was investigated among the 51 human populations. They show a progressive decline with distance from Africa, from ~0.04 in sub-Saharan Africa, to ~0.03 in Eurasia, ~0.02 in East Asia, and ~0.01 in Oceania and the Americas.

About 90 percent of genetic diversity exists within populations

The researchers also carried out an analysis to partition the overall genetic variation into three components: within populations, among populations within a geographic region, and among geographic regions. Within-population variation accounts for 88.9 percent of the total, while variation between populations within a geographic region accounts for 2.1 percent and variation between geographic regions accounts for 9 percent. For comparison, the figures for microsatellite markers are 94.0 percent, 2.3 percent and 3.7 percent respectively. For X chromosome SNPs, the figures are 84.7 percent, 2.4 percent and 12.9 percent respectively, consistent with figures given for X chromosome microsatellites.
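The three-way partition can be checked with simple arithmetic. The Python sketch below uses only the percentages quoted above; the dictionary layout and function names are assumptions made for illustration, and the code does not reproduce the researchers' variance analysis itself. It confirms that each set of figures adds up to 100 percent and sums the two between-population components.

```python
from typing import Dict

# Percentages quoted in the text: (within populations,
# among populations within a region, among regions).
PARTITIONS: Dict[str, Dict[str, float]] = {
    "autosomal SNPs": {"within": 88.9, "among_pops": 2.1, "among_regions": 9.0},
    "microsatellites": {"within": 94.0, "among_pops": 2.3, "among_regions": 3.7},
    "X chromosome SNPs": {"within": 84.7, "among_pops": 2.4, "among_regions": 12.9},
}


def total_percent(partition: Dict[str, float]) -> float:
    """Sum of all three components, rounded to one decimal place."""
    return round(sum(partition.values()), 1)


def between_population_percent(partition: Dict[str, float]) -> float:
    """Share of variation that lies between populations (both levels combined)."""
    return round(partition["among_pops"] + partition["among_regions"], 1)


for marker, partition in PARTITIONS.items():
    print(marker, total_percent(partition), between_population_percent(partition))
```

For each marker type the within-population share dominates, with the combined between-population share ranging from 6.0 percent (microsatellites) to 15.3 percent (X chromosome SNPs).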

Together, these results reaffirm that within-population variation accounts for most of the genetic diversity in humans. In other words, the average genetic difference between populations or ethnic groups is only around ten percent of the total variation, giving little credence to the idea that genetic differences define race, let alone racism.

Nevertheless, the analysis showed that human populations are distinguishable, suggesting that self-reported ancestry is sufficiently accurate for assessing population characteristics, perhaps including disease risks. But the researchers concluded that the observed population structures can be “largely explained by random drift at neutral loci”, i.e., genes that have no effect on survival, except for a few that are adaptations to climate conditions.

Despite these reassurances, the study of human genetic diversity continues to attract controversy.

Early controversies

Mapping the genetic diversity of human populations had been the aspiration of the population geneticists who initiated the Human Genome Diversity Project (HGDP). According to an account of its history, the aim of the HGDP was to sample and preserve DNA from “isolated indigenous populations” before social changes rendered them useless for answering questions about human evolution [2, 3]. But from its inception around 1991 to its “unofficial death” less than a decade later, it was attacked by indigenous rights groups as racist and neo-colonialist. Worse yet, it encouraged unscrupulous ‘gene hunters’ scouring the face of the globe to take blood from indigenous tribes in the hope of finding rare genetic variants that could be patented for producing lucrative cures of common diseases.

In reality, the HGDP did not die, even though it did not get funded. Cavalli-Sforza, who initiated the project, pointed out that population geneticists have been interested in the potential of genetic data to provide information on the history and geography of human populations for much of the past century [4]. However, it was only when the Human Genome Project was in full swing that the idea of a large-scale systematic study of human genome variations was raised. The then president of the Human Genome Organisation, Sir Walter Bodmer, asked him to chair a committee to study the feasibility of a human genome variation project, later named the Human Genome Diversity Project. The National Institute of General Medical Sciences of the US National Institutes of Health (NIH), the US National Science Foundation, and initially also the US Department of Energy supported four symposia between 1991 and 1994 that addressed the genetic, statistical, anthropological, organizational, molecular and ethical issues related to the HGDP.

Cavalli-Sforza recalled the political and ethical difficulties, especially those surrounding the fear that indigenous peoples’ DNA might be exploited for commercial purposes. But he said [4] that “since its inception, the HGDP has avoided commercial interests, and when the project was finally ready to be launched, it was made clear that the DNA samples would be provided only to non-profit-making laboratories. The HGDP has always opposed the patenting of DNA, to allow the study of genetic variation for fundamental research purposes.” Similarly, the charge of racism by ‘naïve observers’ ignored “the fact that half a century of research into human variation has supported the opposite point of view – there is no scientific basis for racism.”

Indeed, the scientists struggled to comprehend the hostility against the HGDP. They were [2] “a politically progressive and socially sensitive lot”, not out to make money but to pursue what they thought was important and urgent research. It was particularly galling to be tarred with the brush of racism given their personal histories: “Luigi Luca Cavalli-Sforza had been a trenchant critic of William Shockley’s claim of black genetic inferiority; Robert Cook-Deegan had a long record of involvement with Physicians for Human Rights; and Mary-Claire King had worked with the grandmothers of the Plaza de Mayo to identify children kidnapped during Argentina’s dirty war.”

At the time, agencies that had financed the HGDP organizational symposia asked the US National Research Council of the National Academy of Sciences (NAS-NRC) to convene a committee to study the feasibility and ethics of the project [4]. From 1994 to 1997, while the NRC committee was organised, met and wrote its report, the HGDP was effectively stalled.

From the beginning, the organizers of the HGDP were convinced that the crucial first effort was to establish a collection of lymphoblastoid cell lines (LCLs) from many populations, rather than simply collecting DNA samples, for reasons of accuracy and renewability [4]. The fact that LCLs had already been made from worldwide populations by researchers of human evolution also supported the validity of the approach. The donation of these lines to the HGDP made immediate funding unnecessary, as several research workers who had collected cell lines from indigenous peoples “unanimously agreed” to contribute cell lines to a central collection that would form the core of the HGDP.

The NAS-NRC committee report, made public at the end of 1996, recommended that the HGDP could proceed, with particular attention being paid to informed consent and related ethical issues. But by that time, funders had decided not to support the HGDP.

Research continued despite lack of funding

The researchers who initiated the HGDP made clear at the outset that they would continue the research, even if funding were not forthcoming, and they did [4].

The Center for the Study of Human Polymorphism (CEPH) in Paris, France, agreed to house and distribute the collection, since referred to as the Human Genome Diversity Panel [1], as it already had LCLs from 40 big family groups, and hence the facilities needed for storing cell lines and distributing large numbers of DNA samples [4]. All five continents are represented in the collection, and all samples are from populations of anthropological interest, i.e., those that were in place before the great diasporas (migrations of populations) started in the fifteenth and sixteenth centuries, when navigation of the oceans became possible. That was important, because these diasporas caused significant population admixtures, especially in the Americas. “Only genetic knowledge of the original populations that contributed to these admixtures can disentangle the various genetic complexities that resulted.”

The HGDP collection was to include more than 1 000 cell lines. The establishment of the HGDP collection, list of populations included in it and the conditions for obtaining DNA samples were announced in April 2002. Labs that request samples must be non-profit-making and must send results of their studies to a CEPH database that will be made available to other researchers. No cell lines would be distributed. The current collection consists of 1 064 cell lines from 52 populations around the world. By July 2004, 56 laboratories had requested and obtained the collection.  

The collection is an important resource for human population genetics and evolutionary studies as well as for biomedical studies, Cavalli-Sforza stated [4]. It has survived with little support, but would need increased funding. The main future requirement is to increase the number of cell lines especially from areas now insufficiently represented. Israel, Pakistan and China are well represented. In contrast, India and Polynesia are not represented at all, and Europe, Northern Asia, the Americas and Oceania have limited representation.

Ethical, legal and social issues in the HGDP

According to Cavalli-Sforza [4], the US NAS-NRC provided general guidelines to ensure that the needs for confidentiality and anonymity were properly addressed, that informed consent was obtained for each cell line, and that the subjects were aware of the possible uses of the data, in conformity with the legal requirements of each country. All the cell lines contributed to the collection were therefore reviewed to make sure that they had been collected in an ethical and legal manner. Only cell lines that complied with the requirements were included in the HGDP resource. The vetting of cell lines and the protocol for confidentiality protection were reviewed by an ethics advisory committee approved by the US NIH National Institute of General Medical Sciences. The only information that remained attached to each cell line concerned ethnic and geographical origin (in degrees of latitude and longitude) and sex.

Cavalli-Sforza stressed [4]: “From an ethical point of view, studies of human population genetics and evolution have generated the strongest proof that there is no scientific basis for racism, with the demonstration that human genetic diversity between populations is small and perhaps entirely the result of climatic adaptation and random drift.”

Isn’t that just the kind of research everyone should support and applaud?

Continuing controversy

Jonathan Marks, a professor of anthropology at the University of North Carolina at Charlotte and a long-time critic of the HGDP, remarked [5]: “Unfortunately it was proposed at the beginning of a new era for US anthropologists, of heightened sensibilities on relevant issues such as indigenous property rights…While the HGDP managed to control the scientific discourse for several years, and dismiss any challenges to it as coming from the dark realm of anti-science, it was ultimately deemed unfundable because of its failure to grapple with the bioethical questions it raised – about consent, disclosure, coercion, identity, economics and race.”

The Genographic Project, begun in 2005 with similar aims, merely circumvented those issues by having private funding in place at the outset. But there is a growing popular consciousness questioning whether [5], “in the era of free-market genomics and biotechnology”, the science of human cells and genes is really there to deliver “the Baconian promise of a better life for all,” or simply “serving the ends of scientists and shareholders.”

Marks described the legal case of the Havasupai, an impoverished Indian tribe living at the base of the Grand Canyon in northern Arizona, who were approached by geneticists from Arizona State University for blood samples in the early 1990s. The Havasupai understood that the samples were to be used to help find a cure for diabetes, which afflicts them and many other Native American groups.

In 2003, a member of the Havasupai tribe enrolled at Arizona State University and discovered accidentally that the blood samples taken for diabetes research were also being used in research on schizophrenia, inbreeding, and population history, without the knowledge or explicit consent of the participants. Not only that, the blood samples were in effect used to cast the tribe in what seemed to them a very negative light, as inbred schizophrenics. Moreover, the population history research contradicted the tribe’s idea of their origin.

In 2004, they filed a $50 million lawsuit against Arizona State University, which was eventually settled out of court in April 2010. The settlement included a cash payment of $700 000 and return of the samples. “More significantly, perhaps, are the provisions for collaborations between the Arizona Board of Regents and the Havasupai people in areas such as health, education, economic development and engineering planning,” Marks wrote. For example, the tribe will collaborate with the University to seek funding to build a clinic and a high school, and Havasupai tribal members will be eligible for scholarships at ASU, the University of Arizona and Northern Arizona University.

There has also been a backlash to the Havasupai case with some accusations of mass ‘anti-science’ attitudes among the Indians and their sympathizers. Marks asked these critics to consider [5] “how the progress of science could actually be held back by scientists being honest, generous and respectful towards participants. It’s the behaviour we would expect of any social actor. Why should scientists be exempt?”

Article first published 29/07/10


Li JZ, Absher DM, Tan H et al. Worldwide human relationships inferred from genome-wide patterns of variation. Science 2008, 319, 1100-4.
Cavalli-Sforza LL and Bodmer WF. The Genetics of Human Populations. W. H. Freeman, San Francisco, 1971, reprinted 1999 by Dover Publications.
Reardon J. Race to the Finish: Identity and Governance in an Age of Genomics. Princeton University Press, Princeton, 2005.
Paul D. Diversity and controversy. Nature 2005, 437, 621-2.
Cavalli-Sforza LL. The Human Genome Diversity Project: past, present and future. Nature Reviews Genetics 2005, 6, 333-40.

Rory Short Comment left 30th July 2010 21:09:20
Humans being what they are the results of scientific activities are almost bound to be contraversial, this does not mean that we should therefore cease all scientific activity but rather that we must work our way constructively through the controversies.

Todd Millions Comment left 19th September 2010 15:03:37
Dr-Mae; Please punch up 'Primitive Humans Conquered Sea,suprising finds suggest'.National geographic feb17 2010. In reference too this study-how would the coasting migrations this implies change the -'inferreds' and 'max likeihoods' you refer too? For instance-human pops doing the Bering sea drift along ice covered shore lines, till they find warm lands in say mexico-during the prevoius too last ice age say?