Science, Society, Sustainability
The ISIS website is archived by the British Library as UK national documentary heritage

ISIS Report 25/01/12

Mystery of Missing Heritability Solved?

Genome-wide scans for genes that determine susceptibility to common diseases have yielded little because most of those genes do not exist; disease genomics is a science fantasy that wastes time and money while the health of the nation deteriorates.

Dr. Mae-Wan Ho

A fully referenced version of this article is posted on ISIS members website and is otherwise available for download here

Please circulate widely and repost, but you must give the URL of the original and preserve all the links back to articles on our website

Where are all the promised genes?

When the human genome sequence was announced in 2000, President Clinton said it would “revolutionise the diagnosis, prevention and treatment of most, if not all human diseases.” Ten years on, and Fortune magazine called it: “The great DNA letdown”. A poll by science journal Nature returned the verdict: “the hoped for revolution against human disease has not arrived.”

That is as some of us had predicted in 2000 ([1] Human Genome - The Biggest Sellout in Human History, ISIS TWN report) and before ([2] Genetic Engineering Dream or Nightmare, ISIS publication).

The human genome project has generated reams and reams of data since its inception, but there is little progress even in the apparently simple task of finding the genes responsible for susceptibility to common diseases (see [3] Ten years of the Human Genome, SiS 48).  

Top geneticists now admit that human genetics has been haunted by the mystery of “missing heritability” of common traits. Genome-wide association studies (GWAS, see Box 1), the current gold standard for the most exhaustive gene hunt that can be performed, have identified ~2 000 genetic variants associated with 165 common diseases and traits; but these variants appear to explain only a tiny fraction of the heritability in most cases [4, 5].

Box 1

Genome wide association studies

Genome wide association studies (GWAS) involve rapidly scanning markers across the complete genomes of many people to find associations of genetic variants with particular diseases or traits. Typically, thousands or tens of thousands of individuals are scanned simultaneously for up to 550 000 single nucleotide polymorphisms (SNPs), which are common differences in single nucleotides at specific sites across the human genome with frequencies > 5 %, using DNA microarrays (chips).

Heritability is technically the proportion of the variability of the trait in a population due to genes. Variability is measured statistically as variance, the average of the squared individual deviations from the population mean. Heritability is commonly referred to as the ‘genetic component’ of the variance, as opposed to the proportion due to the environment, the ‘environmental component’. Note that heritability refers to the variation, and not to the trait itself. Heritability changes according to the environment. It is not uncommon for the heritability of traits such as milk yield or height of a plant from the same genetic strain to change substantially from one year to the next. However, some scientists as well as the popular media tend to assume, mistakenly, that any trait with a large heritability is predominantly genetically determined, which is definitely not the case.
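The definitions in this box can be made concrete with a small numerical sketch. The Python below is illustrative only: the genetic values and the two environmental variances are made-up numbers, and it assumes the simplest case in which the trait is the sum of independent genetic and environmental contributions. It shows variance as the average squared deviation from the mean, and how the same genetic variation gives a different heritability when the environment is more or less variable.

```python
from statistics import pvariance


def heritability(var_genetic, var_environment):
    """Share of total trait variance due to genes.

    Assumes trait = genetic value + environmental deviation, with the
    two parts independent, so that their variances simply add.
    """
    return var_genetic / (var_genetic + var_environment)


# Hypothetical genetic values for five individuals (arbitrary units).
genetic_values = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Population variance: the average of squared deviations from the mean.
var_g = pvariance(genetic_values)

# Same genes, two hypothetical environments.
h2_uniform_env = heritability(var_g, var_environment=0.5)
h2_variable_env = heritability(var_g, var_environment=6.0)

print(f"genetic variance: {var_g}")
print(f"heritability, uniform environment:  {h2_uniform_env:.2f}")
print(f"heritability, variable environment: {h2_variable_env:.2f}")
```

With these made-up numbers the heritability is 0.80 in the uniform environment and 0.25 in the variable one, although the genes are identical in both cases.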

No genes for common diseases?

Nevertheless, the hunt for genes determining susceptibility to common diseases has continued for decades, spurred on over the past 5 years by the availability of DNA chips that allow genome wide scans for more than 500 000 SNPs simultaneously.

Eric Lander and his team at the Broad Institute of MIT and Harvard in Cambridge, Massachusetts, in the United States are among those suggesting that much of the missing heritability never existed in the first place [5]. They base their argument on biometrical genetics, a mathematical discipline that deals with continuously varying traits, such as crop yields, height, body mass, IQ scores, or disease states that fall on a continuum, for example blood glucose, blood pressure, or some measure of disease severity.

I should point out that one arrives at precisely the same conclusion given the pervasive epigenetic influences of the environment on development [1-3], which have been abundantly confirmed and extended since the human genome was sequenced (see [6] Death of the Central Dogma and other articles in the series, SiS 24; [7] Epigenetic Inheritance - What Genes Remember and other articles in the series, SiS 41; [8] Nurturing Nature, ISIS scientific publication).

This convergence of molecular and biometrical genetic analyses is the most conclusive refutation of the reductionist, genetic determinist paradigm of linear causation from genes to traits that had made the Human Genome Project seem such a compelling undertaking; only to thoroughly discredit it as a result (see [9] Living with the Fluid Genome, ISIS publication).

We now know that much of the variation may come from individual experiences of the environment; furthermore, those experiences can mark and change genes, influencing the development of the individual and, in many cases, the individual’s offspring. Genes and environment operate in enormously complex feed-forward and feed-back networks that straddle generations. This fundamentally circular causation between genes and environment means that genetic and environmental contributions are inseparable, and any attempt at assigning linear effects to single genes is doomed to failure.

We shall see how genetic determinism is finally unravelling within the heart of the genetics establishment, beginning with the findings of Lander’s team with regard to common disease traits and continuing with the intelligence and IQ debate ([10] No Genes for Intelligence in the Human Genome, SiS 53).

The genetic component has been greatly over-estimated

Specifically, Lander and colleagues show that the missing heritability arises from an overestimate of total heritability (the genetic component of the variation in the trait), which implicitly assumes that no gene interactions (or gene-environment interactions) exist, an assumption that is clearly unjustified. Including gene interactions gives a much smaller total heritability. In short [5], “missing heritability need not directly correspond to missing variants, because current estimates of total heritability may be significantly inflated by genetic interactions.”

Actually, gene interactions do belong to the ‘genetic component’ of heritability. In biometrical genetics, ‘broad sense heritability’ $H^2$ includes additive genetic effects as well as effects due to gene interactions and any non-additive, nonlinear effects due to genes. But broad sense heritability is very difficult to determine. In practice, only the ‘narrow sense heritability’ $h^2$ (the additive, linear effects due to genes) can be estimated. Narrow sense heritability applies strictly to ‘polygenic’ traits due to many genes each with a small additive effect, and is implicitly assumed to apply to all polygenic traits, beginning with the pioneers of biometrical genetics (see later).

Geneticists therefore define the proportion of heritability of a trait explained, $p_{\mathrm{explained}}$, as the ratio of the phenotypic variance explained by the additive effects of known genetic variants, $h^2_{\mathrm{known}}$, to the phenotypic variance that can be attributed to the additive effects of all variants, including those not yet discovered, $h^2_{\mathrm{all}}$ (Equation 1).

$$p_{\mathrm{explained}} = \frac{h^2_{\mathrm{known}}}{h^2_{\mathrm{all}}} \qquad (1)$$

The numerator $h^2_{\mathrm{known}}$ can be calculated directly from the measured effects of the variants, but the denominator $h^2_{\mathrm{all}}$ must be inferred indirectly from population data.

The prevailing view among geneticists is that the missing heritability is due to additional variants yet to be discovered, either common alleles with moderate-to-small effects or rare alleles (frequency < 1 %) with large effects [4, 5].

The other possibility, favoured by Lander’s team, is that the missing heritability does not actually exist, and is an artefact arising from the total heritability h2all being over-estimated in the first place, by ignoring the impacts of gene interactions.

For example, Crohn’s disease (inflammatory disease of the bowel) has 71 risk-associated loci identified so far. Under the usual assumption of additive effects, these loci explain 21.5 % of the estimated total heritability. Genetic interactions could account for the remaining, nearly 80 %, missing heritability. Why then has genetic interaction never been detected in population analyses? Lander and colleagues point out that detecting gene interactions for Crohn’s disease may require sample sizes in the range of 500 000 individuals, which is rarely attained.
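As a rough check on the arithmetic, Equation 1 and the Crohn’s disease figure can be written out in a few lines of Python. The function names are illustrative, and the two heritability values are hypothetical numbers chosen only so that their ratio matches the 21.5 % quoted above; the text gives the ratio, not the separate values.

```python
def proportion_explained(h2_known, h2_all):
    """Equation 1: share of additive heritability explained by known variants."""
    if h2_all <= 0:
        raise ValueError("h2_all must be positive")
    return h2_known / h2_all


def proportion_missing(h2_known, h2_all):
    """Whatever is not explained by known variants counts as missing."""
    return 1.0 - proportion_explained(h2_known, h2_all)


# Hypothetical values, picked so that h2_known / h2_all = 0.215.
h2_all = 0.50
h2_known = 0.1075

p_explained = proportion_explained(h2_known, h2_all)
p_missing = proportion_missing(h2_known, h2_all)
print(f"explained: {p_explained:.1%}, missing: {p_missing:.1%}")
```

The missing share comes to 78.5 %, which is the “nearly 80 %” referred to in the text.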

But gene interaction, or epistasis, is well-known and pervasive. It is epitomised by the findings of project ENCODE (Encyclopedia of DNA Elements) organised by the US National Human Genome Research Institute, in which a consortium of 35 research groups went through 1 % of the human genome with a fine-tooth comb to find out exactly how genes work [11]. They discovered that [12] “genes appear to operate in a complex network, and interact and overlap with one another and with other components in ways not fully understood.” Essentially, the ‘gene’ as a well-defined, separate unit of structure or function no longer applies. Instead, genes exist in bits strewn across the genome, structurally and functionally intertwined with other genes.

How phantom heritability arises

Lander and colleagues point out [5] that in calculating the explained heritability (Eq. 1), the numerator $h^2_{\mathrm{known}}$ is estimated based on the effects of the individual genetic variants. The problem comes in estimating the denominator $h^2_{\mathrm{all}}$. Because not all the variants are known, their contribution must be inferred based on phenotypic correlations in a population. This gives an apparent heritability, $h^2_{\mathrm{pop}}$. And the missing heritability is then estimated by assuming that $h^2_{\mathrm{all}} = h^2_{\mathrm{pop}}$.

However, there is no guarantee that $h^2_{\mathrm{all}} = h^2_{\mathrm{pop}}$, unless the trait is strictly additive, and neither gene-gene interaction nor gene-environment interaction exists. For traits with gene interaction, which would realistically apply to practically all common traits and diseases, $h^2_{\mathrm{pop}}$ may significantly exceed $h^2_{\mathrm{all}}$. In that case, even when all the variants for the trait have been identified, the missing heritability $p_{\mathrm{missing}}$ will not diminish to zero. Instead, it converges to $1 - h^2_{\mathrm{all}}/h^2_{\mathrm{pop}}$, which Lander and colleagues refer to as ‘phantom heritability’, $p_{\mathrm{phantom}}$.
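The limit quoted above follows in two steps from the definitions already given. The block below is only a restatement in symbols, with a purely hypothetical numerical illustration at the end; the values 0.2 and 0.5 are not taken from any study.

```latex
% Missing heritability as it is estimated in practice,
% with h^2_pop standing in for the unknown h^2_all:
\[
  p_{\mathrm{missing}} = 1 - \frac{h^2_{\mathrm{known}}}{h^2_{\mathrm{pop}}}
\]
% Once every variant has been found, h^2_known has grown to h^2_all:
\[
  p_{\mathrm{phantom}}
  = \lim_{h^2_{\mathrm{known}} \to h^2_{\mathrm{all}}} p_{\mathrm{missing}}
  = 1 - \frac{h^2_{\mathrm{all}}}{h^2_{\mathrm{pop}}}
\]
% Hypothetical illustration: h^2_all = 0.2 and h^2_pop = 0.5
\[
  p_{\mathrm{phantom}} = 1 - \frac{0.2}{0.5} = 0.6
\]
```

In that hypothetical case, 60 % of the apparent heritability would remain “missing” no matter how many variants were discovered.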

Simple model shows how genetic interactions create phantom heritability

To show how genetic interactions create phantom heritability, Lander and colleagues introduced a simple model in which a trait depends on input from more than one process. Phantom heritability (that which remains missing even when all genetic variants have been identified) grows quickly with the number of inputs, approaching 100 % of the total variation. For Crohn’s disease, for example, just 3 inputs are sufficient to account for 80 % of the phantom heritability.
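The text does not spell out the form of the model, so the sketch below assumes one simple possibility: the trait is set by whichever of k independent, normally distributed inputs is lowest (a bottleneck). Under that assumption, it estimates how much of the trait variance a purely additive description of the inputs can capture. It is not a reproduction of Lander and colleagues’ calculation, and it leaves out environmental effects altogether.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def additive_share(k, n=20000, seed=1):
    """Fraction of trait variance captured by additive effects of k inputs.

    Assumed model (not from the source): trait = min of k independent
    standard normal inputs. Because the inputs are independent, the
    variance explained by the best linear predictor is approximately
    the sum of the squared input-trait correlations.
    """
    rng = random.Random(seed)
    inputs = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k)]
    trait = [min(values) for values in zip(*inputs)]
    return sum(pearson(column, trait) ** 2 for column in inputs)


for k in (1, 2, 4):
    print(f"inputs: {k}  additive share of variance: {additive_share(k):.2f}")
```

With a single input the trait is entirely additive. As inputs are added, the additive share falls (to roughly 0.7 with two inputs and 0.5 with four under these assumptions); the remainder is interaction variance that an additive analysis cannot attribute to any individual input.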

Similarly, gene-environment interactions can produce additional phantom heritability, as indeed can other unaccounted sources such as epigenetic effects.

Twin studies deeply flawed

The typical framework for analysing human traits depends on a systematic denial of epistasis, assuming that genes act in a purely additive way, each gene contributing a small amount to the trait, which is summed up depending on how many of those genes are present.

One measure of apparent heritability, $h^2_{\mathrm{pop}}$ (ACE), assumes additive genetic variance, as well as common environmental and unique environmental variance components. A usual definition of apparent heritability is $h^2_{\mathrm{pop}}(\mathrm{ACE}) = 2(r_{MZ} - r_{DZ})$, where $r_{MZ}$ and $r_{DZ}$ are the phenotypic (measured trait) correlations between monozygotic twins (sharing 100 % of their genes) and dizygotic twins (sharing 50 % of their genes) respectively; the environment they share is assumed to be common, including the maternal environment.

But realistically,

$$h^2_{\mathrm{pop}}(\mathrm{ACE}) = h^2_{\mathrm{all}} + W \qquad (2)$$

where $W$ represents the sum of variances due to all possible higher order additive and non-additive interactions between genes. The crucial point is that if there are any gene interactions, then $W > 0$, so $h^2_{\mathrm{pop}}$ (ACE) overestimates $h^2_{\mathrm{all}}$.

Unfortunately, there has been no way to estimate W from population data. In most human studies, the solution is to assume there are no gene interactions, in which case W = 0. Thus, twin studies systematically overestimate the genetic contribution to disease and other traits, most notably and controversially IQ (see [10]).
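Equation 2 can be illustrated with a simulated twin study. The sketch below is an assumption-laden toy, not the analysis in [5]: each input to the trait is a genetic value plus unique environmental noise, there is no shared environment, the trait is the lowest of the k inputs (an assumed form of interaction), and the genetic values of DZ twins are given a correlation of 0.5. It compares the twin-based estimate, twice the difference between the MZ and DZ trait correlations, with the variance actually captured by additive genetic effects, and reports the gap as W.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def falconer(r_mz, r_dz):
    """Apparent heritability from twin correlations: 2 * (r_MZ - r_DZ)."""
    return 2.0 * (r_mz - r_dz)


def simulate_pairs(k, h2_input, relatedness, n, rng):
    """Traits of n twin pairs, plus the k genetic inputs of the first twin."""
    g_sd = h2_input ** 0.5          # genetic part of each input
    e_sd = (1.0 - h2_input) ** 0.5  # unique environment of each input
    trait_1, trait_2 = [], []
    genes_1 = [[] for _ in range(k)]
    for _ in range(n):
        inputs_1, inputs_2 = [], []
        for i in range(k):
            g1 = rng.gauss(0.0, g_sd)
            # Co-twin's genetic value correlates with g1 by `relatedness`.
            g2 = relatedness * g1 + (1.0 - relatedness ** 2) ** 0.5 * rng.gauss(0.0, g_sd)
            inputs_1.append(g1 + rng.gauss(0.0, e_sd))
            inputs_2.append(g2 + rng.gauss(0.0, e_sd))
            genes_1[i].append(g1)
        # Assumed interaction: the lowest input sets the trait.
        trait_1.append(min(inputs_1))
        trait_2.append(min(inputs_2))
    return trait_1, trait_2, genes_1


def twin_study(k, h2_input=0.8, n=20000, seed=7):
    """Return (h2_pop, h2_all, W) for a trait with k interacting inputs."""
    rng = random.Random(seed)
    mz_1, mz_2, genes = simulate_pairs(k, h2_input, 1.0, n, rng)
    dz_1, dz_2, _ = simulate_pairs(k, h2_input, 0.5, n, rng)
    h2_pop = falconer(pearson(mz_1, mz_2), pearson(dz_1, dz_2))
    # Additive share: sum of squared correlations with independent genetic inputs.
    h2_all = sum(pearson(g, mz_1) ** 2 for g in genes)
    return h2_pop, h2_all, h2_pop - h2_all


for k in (1, 3):
    h2_pop, h2_all, w = twin_study(k)
    print(f"inputs: {k}  h2_pop: {h2_pop:.2f}  h2_all: {h2_all:.2f}  W: {w:.2f}")
```

With k = 1 the trait is purely additive and W comes out close to zero. With k = 3 the twin estimate clearly exceeds the additive share, which is the kind of overestimate described in the text.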

Additive assumption fundamental to biometrical genetics

Lander and colleagues are not the first to expose the fundamentally flawed assumptions of classical biometrical genetics. Helen Wallace of UK-based GeneWatch published a similar critique 5 years earlier [13]: gene-gene and gene-environment interactions could reduce the calculated heritability considerably below that predicted by the standard twin-studies method based on pioneering British geneticist Ronald Fisher’s 1918 assumption that genes act additively.


The major implication is that the hunt for susceptibility genes is practically useless. Indeed, Lander and colleagues [5] and others [4] see the primary purpose of medical genetics as the identification of underlying pathways and processes analogous to the hunt for mutants in model organisms; and not in “explaining heritability” or “predicting personalized patient risk.”

But there are much wider implications for health policy. Governments and companies have been keen to set up whole genome biobanks ever since the human genome sequence was announced (see [14] Human DNA 'BioBank' Worthless and other articles in the series, SiS 13/14). The UK government is now pushing to let companies gain access to public health records to drive discovery in disease genomics [15]. But if the genetic contribution to disease is largely a phantom, what is the point of integrating whole genome sequences with electronic medical records, when most of this information is likely to be clinically useless for most people [14, 16]?

There are vested interests that want to keep the genetic myth alive. As Wallace points out, the evidence that she presented in 2006 and that Lander and colleagues presented in 2011 has had no impact on gene testing companies such as Illumina and 23andMe, which continue to claim that everyone will have their genome mapped or sequenced in future, at birth or as a routine part of healthcare. The Director of the National Institutes of Health, Francis Collins, has echoed these claims in his populist book The Language of Life [17]. Wallace is convinced, as I am, that [16] “whole genome sequencing of everyone, leading to the ‘prediction and prevention’ of disease, is a science fantasy and a massive waste of money.”

A fraction of the resources diverted into much needed primary health care and disease prevention through nutritional and other environmental/social interventions will do infinitely more to improve the health (as well as brain power) of the nation.

There are 2 comments on this article so far.
Todd Millions Comment left 29th January 2012 09:09:13
An old item, but one seldom broached on this: do these studies consider that dad may not in fact be father? Even were clan Millions not Yorkies, the amusing fact is, the usual overall factor applied to this is one in seven aren't. So family studies not based on maternal lines could be quite off. Somewhat like the lifespan studies that show married men live longer. The flaw in these is the men are asked how old they are. Birthdate-derived data tend to show the same span. But for some reason, to married men, it tends to SEEM MUCH longer.
MRS Carol Jewell Comment left 19th February 2012 14:02:02
Simplifying how we detect damage (ill health) to the body would go a long way in preventing chronic illnesses. Today doctors, through their own ignorance, have taken to a "blame the patient" culture like a pack of wolves. But we know that in the main today, doctors are not doctors, they are greedy, selfish and ruthless businessmen. We hear much too much about synthetic science and far too little about natural science. Natural science of course is about the importance of cellular activity and how to put it right when it has gone wrong. But then if we did that (which is entirely how it should be) we would surely cure the patient... but rob the greedy pimps masquerading as doctors, physicians, consultants etc. of their lucrative earnings. How do these people sleep at night?


© 1999-2016 The Institute of Science in Society