Ten years of the Human Genome

Reams of Data and No Progress in Sight

Advances in DNA sequencing and computation technologies have ushered in ‘big biology’ and a massive proliferation of data, but no progress in understanding life, health or disease; the life science community is caught between the Scylla of reductionism and the Charybdis of ‘systems biology’.
Dr. Mae-Wan Ho

She landed the capsule softly on the sea floor. The air-pressure readjusted with a faint hiss, like a sigh of relief.

The AQuod is a miracle of engineering design for stealth underwater manoeuvres. It had successfully eluded the thicket of coast guards and submarines patrolling the area, though she had to navigate the obstacle course with great skill, and there were near misses. At one point, she was barely metres from the hulk of a giant submarine before swerving sharply away.

Powered electrically by a small low energy nuclear reactor on board and super-streamlined, the AQuod darts through water silently like a fish, with little or no turbulence in its wake. Not only is it constructed entirely of materials transparent to radar, the outer surface is also covered in video cameras that project constant streams of images backwards, making it effectively invisible both in the water and from the satellites.

A quick scan of her surroundings via the cameras confirmed that she had successfully navigated to the south side of the island, behind the patrolling coast guards, where the secret entrance to the complex was located. She could make out the shallow contours of the shoreline, and the feathery fronds of seaweeds waving sinuously around the capsule. Swiftly, she donned her Skinflint, beginning with the legs and the arms that end snugly in toes and fingers; then pulled the hood over her head and secured the suit over her chest and full belly. It clung to her like a second skin, and would insulate her against the ice-cold water, protecting her from teeth and fangs and other sharp objects while enabling her to move like a shark with night vision. After checking that her naked body was completely covered, she pressed a tiny button on the left wrist, and instantly became invisible, even to herself.

She took a deep breath before venturing out; her mission was to destroy the Goomizon, a supercomputer complex now under the official jurisdiction of the United Nations Human Genome Commission, though the companies that set up the database storage facility still maintained commercial control. Security had been tight ever since the “Genome Warriors” vowed to destroy the Goomizon after numerous failed attempts to hack into the database to wipe it clean.

The Goomizon has the genome sequences of about half the human species stored, including those of the 60 million proposed for ‘humane termination’ because they carry the wrong combinations of SNPs in their genomes, her own and her unborn son’s among them….

Muted celebrations

The tenth anniversary of the human genome sequence could hardly pass without fanfare. So the great and the good gathered once more; though celebrations, if they qualified as such, have been relatively low-key and muted [1]. The achievements have been modest if not disappointing, especially measured against the promises made when the genome was unveiled ten years ago in the White House.

Clinton said it would [2] “revolutionize the diagnosis, prevention and treatment of most, if not all human diseases.” At a news conference, Francis Collins, then director of the genome agency at the National Institutes of Health, said that genetic diagnosis of diseases would be accomplished in 10 years and that treatments would start to roll out perhaps five years after that. Some of us had expressed considerable doubt as to whether these promises could ever be delivered [3] (Human Genome - The Biggest Sellout in Human History, I-SIS TWN report).

The successes in DNA sequencing ‘big biology’

The major success in the ten years since the first human genome map is the phenomenal advance in DNA sequencing technologies. It took ten years of international effort costing billions of dollars to sequence the first human genome. Today, this can be done in a day, in a single machine for just a few thousand dollars, said Craig Venter [4], who headed the private company Celera that came up from behind to a ‘joint finish’ with the public consortium in the race to sequence the first human genome. Genomes can be sequenced around 50 000 times faster today than in 2000. An important part of the acceleration is that the first human genome sequence can now serve as a reference against which data from new sequences can be compared.

Sequenced genomes of non-human species tally more than 3 800. A total of 13 complete individual human genomes have been released, including that of Archbishop Desmond Tutu of South Africa [2]. In 2011, an international team is set to complete the data-producing phase of the ‘1000 Genomes Project’, to produce highly accurate assembled sequences from more than 1 000 individuals whose ancestors came from Europe, Asia and Africa.

Meanwhile the Genome 10K Project was launched to assemble a ‘genomic zoo’ – a collection of the genomes of 10 000 vertebrate species, about one for every vertebrate genus that exists [5]. And the Wellcome Trust announced a study to sequence 10 000 genomes in the UK [6], amounting to about 1 in every 6 000 individuals in the country, in the hope of uncovering many ‘rare genetic variants that are important in human disease’.

The initiation of ‘big science’ for biology [2] is certainly counted among the major successes of the human genome map. Other big biology efforts include the International HapMap Project (2002-2005), which charted the single nucleotide polymorphisms (SNPs), the points at which human genomes commonly differ (with the intention of assembling these into association clusters and clusters into groups or ‘haplotypes’); the Encyclopedia of DNA Elements (ENCODE), which aims to identify every functional element in the human genome; and the Genome-Wide Association Studies (GWAS) to uncover common DNA variations important for human diseases.

For science, the most important decision was to make the human sequence data freely available online, an effort coordinated by the US Department of Energy and the National Institutes of Health (NIH) [7]. This has contributed to sequencing the Neanderthal genome and tracing the origins of the human species (see Box), both much lauded in the press.

But the more significant, if not the most significant, contribution of the human genome to science in my view - hardly mentioned in the press except obliquely - is the final demise of genetic determinism in favour of epigenetics and the fluid genome, in which environmental influences play the key role in marking and even changing genes [8-11] (see The Human Genome Map, the Death of Genetic Determinism and Beyond, ISIS report; Death of the Central Dogma and other articles in the series, SiS 24; Epigenetic Inheritance - What Genes Remember and other articles in the series, SiS 41; Nurturing Nature, ISIS scientific publication).

Unfortunately, vested interests and lack of imagination are conspiring to keep the genome myth alive. So let’s take stock of what has really been achieved.

The story of human evolution from the human genome

The story of how our species evolved unfolds around sequencing the Neanderthal genome and mapping the diversity of human populations.

Neanderthals, the closest relatives of today’s humans, first appeared in the European fossil record about 400 000 years ago, and lived in large parts of Europe and western Asia before disappearing 30 000 years bp (before present). Analysis of the newly sequenced Neanderthal genome showed that it is more closely related to non-African genomes from Asia, Europe and Papua New Guinea than to African genomes. This suggests that Neanderthals interbred with modern humans in the Middle East after the modern humans left Africa, but before they spread into Asia and Europe [12, 13].

The genomes of Neanderthals and modern humans are 99.84 percent identical. Sequencing the Neanderthal genome was no mean feat, as it was extremely difficult to extract sterile DNA from fossil bone in the first place, and without the standard human genome sequence, it would have been impossible to screen out contaminating bacterial sequences. The work involved 56 scientists from 20 laboratories.

Researchers at Stanford University, California, in the US carried out a study of 650 000 common SNPs in 938 unrelated individuals from 51 populations from around the world: sub-Saharan Africa, North Africa, Europe, the Middle East, South/Central Asia, East Asia, Oceania, and the Americas [14]. By analyzing each genome separately, they were able to segregate the 938 individuals into the major continental groups consistent with theories from archaeology and linguistics indicating that the ancestors of many human populations originated in Africa (the “out-of-Africa” hypothesis). Significantly, most of the genetic diversity (88.9-94.percent) exists within populations or ethnic groups, while the difference between ethnic groups accounts for less than ten percent of the genetic diversity, giving no substance to any genetic theories on race, much less racism. The genetic diversity of populations was found to be highest in sub-Saharan Africa, and decreases steadily with distance from the region. This trend is consistent with a serial ‘founder effect’ in which population expansion follows successive migrations of a small number of individuals out of the previous location, starting from a single origin in sub-Saharan Africa. Other smaller studies came up with similar findings, as for example, an analysis of 525 910 SNPs and 396 copy number variable loci (variable repeats in the genome) in a worldwide sample of 29 populations [15].
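The split between within-population and between-population diversity can be illustrated with a minimal sketch. It assumes a single biallelic SNP, equal-sized populations and invented allele frequencies (not data from [14]); the function names are hypothetical. It compares the mean expected heterozygosity within populations with that of the pooled population, which is only one of several ways such a partition can be estimated.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def apportion_diversity(freqs):
    """Return (within, between) fractions of diversity for one SNP.

    freqs: frequency of the same allele in each population.
    Assumes equal population sizes (an illustrative simplification).
    """
    if not freqs:
        raise ValueError("need at least one population")
    # Mean diversity inside each population.
    h_within = sum(expected_heterozygosity(p) for p in freqs) / len(freqs)
    # Diversity of the pooled population, from the average allele frequency.
    p_mean = sum(freqs) / len(freqs)
    h_total = expected_heterozygosity(p_mean)
    if h_total == 0.0:
        raise ValueError("site is not variable in the pooled population")
    within = h_within / h_total
    return within, 1.0 - within


# Hypothetical allele frequencies in three populations (illustration only).
within, between = apportion_diversity([0.40, 0.50, 0.60])
print(f"within: {within:.1%}, between: {between:.1%}")
```

Even with allele frequencies that differ noticeably between the three hypothetical populations, most of the diversity in this toy case sits within populations, which is the qualitative pattern the study reports.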

The study of human diversity has been mired in controversy that may not have been fully resolved [16] (see Celebrating the Uses of Human Genome Diversity, SiS 48).

No good for business

For Fortune magazine, it was “The great DNA letdown” [17]. In terms of human health, the human genome sequence has failed to live up to the hype. Furthermore, the obsession with DNA may have hindered the development of other approaches to understanding and treating diseases. “If nothing else, it diverted resources to companies and efforts that are now almost all gone – bankrupt, sold, or redirected away from pure genomics.”

The number of novel drugs approved was 25 in 2009 (up from only 11 in 2005) compared to 53 in 1996.

Most of the gene tests offered by online companies such as 23andMe and deCODEme are not validated; and even those that have been validated usually reveal only a slightly increased risk factor. Just 35 000 customers have signed up for 23andMe, a company much hyped since it was launched in 2007; Time magazine named it, along with personalized genomes, “the innovation of the year” in 2008.

An editorial in Nature [18] noted that biotech companies including Celera, Decode Genetics in Reykjavik, Iceland, and Human Genome Sciences of Rockville, Maryland, have had to rethink their optimistic assumption that selling human genetic information could turn a profit. Excitement over start-up companies offering personal genetic testing withered just as fast when it became clear that their predictions have little ‘actionable’ value.

Little to show for health

Has human health benefited from sequencing the human genome? Nature reported [18] a “startlingly honest response… from leaders of the public and private efforts, Francis Collins and Craig Venter, both say ‘not much’.” The New York Times concurred [19]: “A decade later, genetic map yields few new cures.”

Francis Collins, then head of the public consortium to sequence the human genome and now director of the US National Institutes of Health, admitted [2]: “The consequences for clinical medicine, however, have thus far been modest.” Despite some major advances, “it is fair to say that the Human Genome Project has not yet directly affected the health care of most individuals.”

Nevertheless, Collins valiantly attempted to save the situation [2]: “The promise of a revolution in human health remains quite real. Those who somehow expected dramatic results overnight may be disappointed, but should remember that genomics obeys the First Law of Technology: we invariably overestimate the short-term impacts of new technologies and underestimate their longer term effects.” But these hopes, too, are all but dashed by the latest revelations from the laboratory.

Currently some 850 sites on the genome, most of them located near genes, have been implicated in common diseases, said Eric Lander [19], director of the Broad Institute, Cambridge, Massachusetts, and a leader of the HapMap project. He feels strongly that the project and its motivating hypothesis have been vindicated.

But most of the sites linked with diseases are not in genes, and have no known function. Some geneticists suspect that the associations are spurious.

The research team led by Nina Paynter of Brigham and Women’s Hospital in Boston looked at 101 SNP variants that had been linked to heart disease, following 19 000 women for 12 years [20]. Together, these 101 SNP variants turned out to have no value in predicting the disease. In contrast, family history was the most significant predictor, as it had been before genomics.
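In its simplest form, a literature-based genetic risk score of the kind tested in [20] is a count of the risk alleles a person carries across the listed SNPs. The sketch below assumes an unweighted additive score and uses invented SNP identifiers and genotypes; it is an illustration of the idea, not the authors' code or data.

```python
def genetic_risk_score(genotype, risk_alleles):
    """Count the risk alleles carried across a panel of SNPs.

    genotype: dict mapping SNP id -> two-letter genotype string, e.g. "AG".
    risk_alleles: dict mapping SNP id -> the allele reported as risk-increasing.
    SNPs missing from the genotype are skipped (an assumption of this sketch).
    """
    score = 0
    for snp, risk in risk_alleles.items():
        alleles = genotype.get(snp)
        if alleles is None:
            continue
        # Each SNP contributes 0, 1 or 2 copies of its risk allele.
        score += alleles.count(risk)
    return score


# Hypothetical three-SNP panel and one person's genotypes (illustration only).
panel = {"snp_a": "A", "snp_b": "T", "snp_c": "G"}
person = {"snp_a": "AG", "snp_b": "TT", "snp_c": "CC"}
print(genetic_risk_score(person, panel))
```

A score like this is then compared against disease outcomes; the finding reported above is that it added nothing to prediction, whereas a simple question about family history did.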

Robert Weinberg, cancer researcher at Whitehead Institute and MIT, Cambridge, remarked [21] that there is little to show for all the time and money invested in genomics studies of cancer.

Revolution in biology or endless proliferation of data

Science journal Nature’s poll of more than 1 000 life scientists returned the verdict that “the hoped-for revolution against human disease has not arrived [22].” What the sequence has brought about is a revolution in biology; 69 percent of respondents said that the human genome project inspired them either to become a scientist or to change the direction of their research. Some 90 percent said their own research has benefited from sequencing the human genome, with 46 percent saying that it has done so “significantly”. Almost one-third use the sequence “almost daily” in their research. “For young researchers like me, it’s hard to imagine how biologists managed without it,” wrote one scientist.

The impact on life science is hardly surprising, considering the huge investments poured into sequencing and genomics, which are bound to change how biology is done. Robert Weinberg, for one, is unhappy about the revolution in biology and the increasing proportion of US national research budgets being swallowed up in genomics and allied ‘systems biology’ research at the expense of ‘small-scale, hypothesis driven projects’.

The biggest impacts of the genome sequence, according to the poll, have been advancing the tools of the trade, sequencing and computational biology. But the “data dreams” are spawning “analysis nightmare”.

There’s a sort of disappointment that despite having so much data, there is still so much we don’t understand, said David Lipman, director of the US National Center for Biotechnology Information in Bethesda, Maryland. Lack of adequate software to analyse the data, shortage of bioinformaticians and raw computing power are also among the major problems. ‘Cloud computing’ is being considered, in which labs buy computing power and storage in remote computing farms from companies such as Google, Amazon, and Microsoft. The European Nucleotide Archive, launched 10 May 2010 at the European Molecular Biology Laboratory’s European Bioinformatics Institute in Cambridge, UK, will offer labs free remote storage of their genome data and the use of bioinformatics tools.

Data seem likely to proliferate massively and endlessly. For Todd Golub, director of the Cancer Program at the Broad Institute, Cambridge, Massachusetts, large unbiased genomic surveys are the way forward [23]. Other researchers, not content with DNA (genomes), transcribed RNA (transcriptomes), and proteins (proteomes), are asking what difference it makes to the phenotype, or morphology and function, of cells when each gene is silenced by RNA interference [24]. Using chemically synthesized short sequences of RNA designed to interfere with the expression of specific genes, they image the affected cells by time-lapse photography. The recording is done in triplicate on 67 cells for two days for every gene, there being about 21 000 genes in the human genome.
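To get a feel for the volume of data such a screen implies, here is a back-of-envelope sketch using only the figures quoted above. It assumes that “67 cells” means 67 cells per replicate recording, which is one reading of the text; the function name and dictionary keys are hypothetical.

```python
GENES = 21_000            # approximate number of human genes quoted in the text
REPLICATES = 3            # each gene is recorded in triplicate
CELLS_PER_RECORDING = 67  # assumed to mean cells per replicate recording
DAYS_PER_RECORDING = 2    # each recording runs for two days


def screen_scale(genes=GENES, replicates=REPLICATES,
                 cells=CELLS_PER_RECORDING, days=DAYS_PER_RECORDING):
    """Return rough totals for a genome-wide time-lapse screen."""
    recordings = genes * replicates
    return {
        "recordings": recordings,
        "cells_imaged": recordings * cells,
        "recording_days": recordings * days,
    }


for name, value in screen_scale().items():
    print(f"{name}: {value:,}")
```

Under these assumptions the screen amounts to tens of thousands of separate two-day recordings and several million imaged cells, before any analysis begins.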

Unimaginable Complexities

But it is the unimaginable entangled complexities of molecular genetics that ultimately defeat any hope of linking specific bits of DNA to any disease, let alone people’s behaviour, personality or other physical and mental attributes.

Just when we finally got used to thinking that a gene in molecular terms was the coding sequences (eventually read out as the amino acid sequences of proteins) equipped with various regulatory regions for start and stop that would determine how actively the gene is expressed, when, where, and for how long, we need to think again. New research results from ENCODE revealed how such ‘genes’ are in bits dispersed throughout the genome, interweaving with bits of other genes [25]. As genes are intertwined, so are the functions. Multiple DNA sequences may serve the same function, and conversely the same DNA sequence can have different functions. It is futile to try and define a gene or a separable function for any piece of DNA. This is ultimately why genes for common diseases can never be found, much less behaviour or any other physical or mental attributes.

As an example of the unimaginable entangled complexities involved, science journalist Erika Check Hayden, writing in Nature [26], reported on just one protein, p53, which has been known to suppress cancer by causing apoptosis (programmed cell death). It turns out to have a host of other functions.

Japanese researchers recently found that p53 helps to process several varieties of small RNA that keep cell growth in check. Even before that, it was clear that p53 sat at the centre of a dynamic network of protein, chemical and genetic interactions. Researchers now know that p53 binds to thousands of sites in DNA, and some of these sites are thousands of base pairs away from any genes. It influences cell growth, death and DNA repair. It also binds to numerous other proteins, which can modify its activity, and these protein–protein interactions can be tuned by the addition of chemical modifiers, such as phosphates and methyl groups. Through alternative splicing of the RNA transcript, the resulting p53 protein can take nine different forms, each of which has its own activities and chemical modifiers. Biologists are now realizing that p53 is also involved in processes beyond cancer, such as fertility and very early embryonic development. “In fact, it seems wilfully ignorant to try to understand p53 on its own. Instead, biologists have shifted to studying the p53 network, depicted in cartoons containing boxes, circles and arrows meant to symbolize its maze of interactions,” Hayden wrote [26].

So, the scenario portrayed at the beginning of this article is not going to happen because of any identifiable defective combinations of SNPs. But there is definitely a potential for big governments or terrorists to misuse the data, as for example, to create a biological weapon that would specifically target a population with certain SNPs and repeat sequences…

The Scylla and Charybdis of reductionism and mindless data proliferation

Ironically, Weinberg finds himself defending the old reductionist approach that has been abandoned for ‘systems biology’ [21]. Systems biology, he said, is undermining tried-and-tested ways of doing and building science, which is reductionism, “the idea that complex biological systems can be understood by dismantling them into their constituent pieces and studying each in isolation.” This is clearly contradicted by the entangled complexities of gene functions described above.

Weinberg criticises the “massive data-generating projects” that “have yet to yield a clear consensus about how many somatic mutations are required to create a human tumour, and have given us few major breakthroughs in our understanding of how individual tumours develop.” The most ambitious large-scale venture involves assembling the many interacting signalling components within individual cells into wiring diagrams, dubbed ‘hairballs’. “But these have yielded few conceptual insights into how and why cells and tissues behave the way they do.”

He concluded: “The stakes are high. The repercussions of major agencies shifting their funding allocations will be felt for a generation. The long term effects will be an increasing inability of many biological disciplines to attract the brightest young people.”

Weinberg’s foreboding [21] sums up the dilemma of biology, caught between the Scylla of the discredited reductionist approach rejected by most biologists, and the Charybdis of data proliferation, the royal road to ‘systems biology’ widely mistaken for the alternative to reductionist biology [27] (see No System in Systems Biology and other articles in the series, SiS 21).

When the draft human genome sequence was announced ten years ago, I wrote [3]: “Bio-informatics suffers from the reductionist fallacy that knowledge will automatically arise once information is exhaustively listed. Molecular biology is suffocating from information overload. What we need is a quantum leap to a new paradigm for understanding the organism as a coherent whole. Otherwise, human genome research will remain a scientific and financial black hole that swallows up all public and private resources without any return either to investors or to improving the health of nations.”

This assessment is even more relevant today than it was then.

Article first published 27/07/10

References

1. The human genome at ten. Editorial, Nature 2010, 464, 649-50.
2. Collins F. Has the revolution arrived? Nature 2010, 464, 674-5.
3. Ho MW. Human Genome – the biggest sellout in human history. I-SIS TWN Report, 2000, https://www.i-sis.org.uk/humangenome.php; a slightly different version in I-SIS News 6, September 2000, https://www.i-sis.org.uk/isisnews/i-sisnews6.php#huma
4. Venter C. Multiple personal genomes await. Nature 2010, 464, 676-7.
5. Genome 10K Project, 4 November 2009, http://www.genome10k.org/
6. “Wellcome Trust launches study of 10 000 genomes in UK”, Wellcome Trust Sanger Institute, 24 June 2010, http://www.sanger.ac.uk/about/press/2010/100624-uk10k.html
7. “Human genome at ten: 5 breakthroughs, 5 predictions”, Ker Than, National Geographic News, 30 March 2010, http://news.nationalgeographic.com/news/human-genome-project-tenth-anniversary/
8. Ho MW. The human genome map, the death of genetic determinism and beyond. Third World Resurgence 2000, 127-128, 14-18.
9. Ho MW. Death of the Central Dogma. Science in Society 24, 4, 2004.
10. Ho MW. Epigenetic inheritance, what genes remember. Science in Society 41, 4-5, 2009.
11. Ho MW. Nurturing nature. Essay in honour of Ruth Hubbard. I-SIS scientific preprint. To appear in Myths of the DNA Paradigm: Essays on the Uses and Abuses of Genetic Explanation (Sheldon Krimsky and Jeremy Gruber eds). Council for Responsible Genetics, Washington, USA. https://www.i-sis.org.uk/nurturingNature.php
12. Green RE, Krause J, Briggs AW et al. (56 authors from 20 laboratories; lead author for the present study Svante Pääbo at Max Planck Institute for Evolutionary Anthropology in Leipzig, Germany). A draft sequence of the Neandertal Genome. Science 2010, 328, 710-22.
13. “Close encounters of the prehistoric kind”, Ann Gibbons, Science 2010, 328, 680-4.
14. Li JZ, Absher DM, Tan H et al. Worldwide human relationships inferred from genome-wide patterns of variation. Science 2008, 319, 1100-4.
15. Jakobsson M, Scholz SW, Scheet P et al. Genotype, haplotype and copy-number variation in worldwide human populations. Nature 2008, 451, 998-1003.
16. Ho MW. Celebrating the Uses of Human Genome Diversity. Science in Society 48.
17. “The great DNA letdown”, David Ewing Duncan, Fortune, 8 April 2010, http://tech.fortune.cnn.com/2010/04/08/the-decade-of-the-human-genome-where-are-the-fab-four/
18. The human genome at ten. Editorial, Nature 2010, 464, 649-50.
19. “A decade later, genetic map yields few new cures”, Nicholas Wade, New York Times, 12 June 2010, http://www.nytimes.com/2010/06/13/health/research/13genome.html?_r=1&th&emc=th
20. Paynter NP, Chasman DI, Paré G, Buring JE, Cook NR, Miletich JP and Ridker PM. Association between a literature-based genetic risk score and cardiovascular events in women. JAMA 2010, 303, 631-7.
21. Weinberg R. Point: hypotheses first. Nature 2010, 464, 678. Robert Weinberg is a researcher at Whitehead Institute and MIT, Cambridge.
22. “Human genome at ten: science after the sequence”, Declan Butler, Nature News, 23 June 2010, http://www.scientificamerican.com/article.cfm?id=human-genome-at-ten
23. Golub T. Counterpoint: data first. Nature 2010, 464, 678.
24. Neumann B, Walter T, Hériché J-K, et al. Phenotypic profiling of the human genome by time-lapse microscopy reveals cell division genes. Nature 2010, 464, 722-7.
25. ENCODE Project Consortium. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 2007, 447, 799-816.
26. “Life is complicated”, Erika Check Hayden, Nature 2010, 464, 664-7.
27. Ho MW. No system in systems biology. Science in Society 21, 46, 2004.

There are 3 comments on this article.

Todd Millions Comment left 28th July 2010 14:02:03
More than just the data mountains need re-evaluating. Some months ago, in a completely unrelated article posted at CounterPunch, I came across the tidbit that the reason the tortuous single-path run back to an African 'Eve' is needed for mitochondrial founder work is that the Israeli researchers pre-eminent in this work want it to 'prove' (against all evidence, how 'Christian') the existence of four biblical matriarchs (unnamed), with no mixing outside the faith! I'm trying to imagine what Stephen J Gould would have made of this. We have the cheetah model (of the right body mass range, a bonus), and good pedigree records of royal pets and the time of the virulent distemper outbreak that left only a few LUCKY and isolated survivors (NOT superior specimens), to get a good baseline for a mitochondrial range that is just as restricted as ours (compared to other primates), and the RATE of mutation. I expect that such work would push the 'Eve' bottleneck back to the 95% disappearance of all large lowland animals, and show small inbred dispersed survivor groups. That this isn't done, due to the propping up of the Bronze Age fairy tales of an even more inbred nest of hill bandits, is beyond amusing. It's disappointing. The other health link that should be explored is contamination of nutrient source stocks by GMO crops. Sic: vitamin C for health supplements, food preservation and infection control is all made from sugar sources that have various Bt insert and herbicide modifications, to the point that military physicians (and vets, who are paid only when their patients recover, unlike the doctors) are having a deuce of a time finding ascorbic acid stock pure enough to use intravenously without sending the patient into fatal shock reactions.
Since vitamin C isn't patentable, and so no consultancy work or board of director positions are involved, this doesn't matter to various health and pharma whore ministries and agencies, who are completely unaware of the problem anyway. Therefore there is no problem as far as they are concerned. How H1N1.

Rory Short Comment left 29th July 2010 01:01:28
Reductionism has its role. It is when we apply it to life that it becomes ineffective because it would seem that life cannot be better understood in reductionist as opposed to holistic terms. As a life form myself this makes absolute 'gut sense' to me. In fact I feel positively uplifted by this conclusion.

F.Z.Horusitzky Comment left 27th January 2012 20:08:06
The genomes of Neanderthals and modern humans are 99.84 percent identical. How did You find 99.84 % instead of 99.7 % ? Neither 99.84 nor 99.7 are in the Draft Sequence article of Green et al56 2010