Science in Society Archive

Mismatch of RNA to DNA Widespread

The latest episode in the unfolding saga of the fluid genome reveals widespread non-random changes in the RNA messages transcribed from genomic DNA. Dr. Mae-Wan Ho

Genetic determinist paradigm unravelling

Nothing has done more to unravel the genetic determinist paradigm than post-genomics science, even though some of us had pointed out that genetic determinism was already discredited long before the Human Genome Project was conceived (see [1, 2] Genetic Engineering Dream or Nightmare and Living with the Fluid Genome, ISIS publications).  Blaming the genes is simply bad science catering to the worst human prejudices about our own failings as well as those of others. The new genetics of the fluid genome is decidedly not the “blank slate” set up as a straw man by attention-grabbing apologists of genetic determinism such as Steven Pinker [3]. It is a much more exquisite entangled relationship between heredity and environment that goes beyond any crude determinism, environmental or genetic; in which human beings actively shape their own lives as well as the lives of future generations (see [4, 5] Development and Evolution Revisited, and  Nurturing Nature, ISIS Scientific publications).

One major post-genomics development is epigenetics, a rapidly expanding discipline documenting the myriad ways in which an individual’s experience can mark and change genes, which are then passed on to the next generation (see [6] Epigenetic Inheritance - What Genes Remember and other articles in the series, SiS 41). For a brief spell, geneticists took comfort in the belief that the DNA sequence in the genome remains unchanged and is simply marked; that turned out not to be the case. Then they assumed that, with few exceptions, the RNA messages transcribed from gene sequences in genomic DNA remain faithful copies that are translated into proteins. Not anymore; the revolution continues.

A team of scientists at the University of Pennsylvania led by Vivian Cheung presented the latest episode in the unfolding saga of the fluid genome. They found “widespread” mismatches between the RNA messages and the DNA gene sequences in the genome that the messages were copied from [7]. This is by no means the first time such mismatches have been detected: RNA editing, a process that changes specific bases in the RNA transcript, has been known for many years. But most of the newly discovered changes are due to as yet unknown mechanisms, and are much more widespread than previously thought.

RNA and DNA sequences compared in the same individuals 

The researchers compared RNA sequences isolated from immortalized human B cells of 27 individuals who took part in the international HapMap and the 1000 Genomes projects (see [8] Ten years of the Human Genome, SiS 48) to their corresponding genomic DNA sequences. They chose to look only at monomorphic loci - genes that do not vary among individuals in the population - coding for proteins.

They detected 28 766 mismatch events - RNA differed from DNA - in 10 210 sites residing in 4 741 known genes [7]. Each of the differences was observed in at least two individuals, many in B cells as well as in primary skin cells and brain tissues taken from a separate set of individuals, and also in expressed sequence tags from cDNA libraries of various cell types.

All 12 possible types of mismatches were observed. About 43 percent of the differences are transversions (purine to pyrimidine base and vice versa) and therefore cannot be the result of typical deaminase-mediated RNA editing.  The differences were non-random, as many mismatched sites were shared by multiple individuals and in different cell types, including primary skin cells and brain tissues. Peptides translated from the discordant RNA sequences were also detected with mass spectrometry.
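The distinction between transitions and transversions can be illustrated with a minimal Python sketch. It is not taken from the study; the function name `mismatch_class` is hypothetical, and T stands in for U on the assumption that RNA is read as cDNA.

```python
from itertools import permutations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}  # T stands in for U (assumption: RNA sequenced as cDNA)

def mismatch_class(dna_base, rna_base):
    """Classify a DNA -> RNA base change as a transition or a transversion."""
    if dna_base == rna_base:
        raise ValueError("not a mismatch")
    pair = {dna_base, rna_base}
    # A transition stays within one chemical family; a transversion crosses families.
    same_family = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_family else "transversion"

# Every ordered pair of distinct bases is one possible mismatch type.
all_types = list(permutations("ACGT", 2))
transversion_types = [p for p in all_types if mismatch_class(*p) == "transversion"]
print(len(all_types), len(transversion_types))
```

Enumerating the ordered pairs gives the 12 possible mismatch types mentioned above; the four that stay within one chemical family are the transitions, among them the A to G and C to U changes associated with deaminase editing.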

These newly discovered RNA/DNA mismatches are in addition to those known to arise from errors in transcription and from RNA editing carried out by enzymes that change mRNA after transcription: ADARs (adenosine deaminases, which deaminate adenosine (A) to inosine (I), recognized by the translation machinery as guanosine (G)) and APOBECs (apolipoprotein B mRNA editing enzymes, which edit cytosine (C) to uracil (U)). Many A to G sites have been identified previously whereas C to U changes are rare.

Mismatches in more than one-third of genes and non-random

The 4 741 known genes with mismatches only included monomorphic loci, and represent 36 percent of the 13 214 well-characterized genes that have been covered by 10 or more RNA sequencing reads in at least one part of the gene, in two or more individuals. There are 6 698 A to G events, which could be the result of deamination, and 1 220 C to T differences, which could also be mediated by a known deaminase RNA editing enzyme, APOBEC1. However, that enzyme is not expressed in B cells. In addition, another 12 507 transversions (43 percent) could not have resulted from classic deaminase-mediated editing. These are referred to as RNA-DNA differences (RDDs) resulting from as yet unknown mechanisms.
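The proportions quoted here can be checked with a few lines of Python. This is only an arithmetic sketch using the counts reported in this article; the variable names are illustrative, and the remainder labelled `other_events` is simply what the article does not itemise.

```python
# Counts as quoted in this article (from reference [7]).
total_events = 28_766
genes_with_mismatch = 4_741
genes_covered = 13_214
a_to_g = 6_698
c_to_t = 1_220
transversions = 12_507

def percent(part, whole):
    """Share of `part` in `whole`, in percent."""
    return 100 * part / whole

gene_share = percent(genes_with_mismatch, genes_covered)
transversion_share = percent(transversions, total_events)
# Events not itemised in the article (assumed here to be the remaining mismatch types).
other_events = total_events - a_to_g - c_to_t - transversions

print(f"genes with mismatches: {gene_share:.1f}%")
print(f"transversions: {transversion_share:.1f}%")
print(f"events not itemised: {other_events}")
```

Rounded to whole numbers, the two shares agree with the 36 percent and 43 percent given in the text.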

An example is C changed to A on chromosome 12 in the myosin light chain gene MYL6, where 16 of the subjects showed C/C in their DNA (two copies of the gene) but A/C in their RNA sequences. Another example is A to C on chromosome 6 in the gene HSP90AB1 encoding a heat shock protein, where eight individuals have homozygous A/A DNA genotype but have A/C in their RNA.
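The logic of examples such as MYL6 can be sketched in Python: a site counts as an RNA-DNA difference when the RNA reads support a base that is absent from the individual's DNA genotype. This is a simplified illustration and not the authors' pipeline; the function names and the thresholds `min_reads` and `min_fraction` are assumptions chosen for the example.

```python
from collections import Counter

def rna_alleles(read_bases, min_reads=10, min_fraction=0.2):
    """Bases supported by enough RNA reads at one site.

    Both thresholds are illustrative assumptions, not values from the study.
    """
    if len(read_bases) < min_reads:
        return set()  # too little coverage to say anything
    counts = Counter(read_bases)
    return {b for b, n in counts.items() if n / len(read_bases) >= min_fraction}

def is_rdd(dna_genotype, read_bases, **thresholds):
    """True if the RNA shows a base that is not present in the DNA genotype."""
    return bool(rna_alleles(read_bases, **thresholds) - set(dna_genotype))

# Hypothetical reads resembling the MYL6 case: DNA is C/C, RNA shows both A and C.
print(is_rdd("CC", "C" * 7 + "A" * 5))
# Same genotype, RNA fully concordant with DNA.
print(is_rdd("CC", "C" * 12))
```

A single stray base among many reads would more plausibly be a sequencing error, which is why the sketch requires a minimum share of reads before a base is counted as an RNA allele.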

Altogether, 8 163 (80 percent) of the sites were found in at least 50 percent of the 19 individuals with well-characterized genes.  Some of the sites were found in nearly all of them.

Each person has on average 1 065 RDDs (range 282 to 1 863), which show no significant correlation with ADAR expression, confirming that the RDDs did not result from known RNA editing. The sites are not evenly distributed across the genome. Chromosome 19 has the most sites, whereas chromosome 13 has the fewest, and the pattern remains after correcting for differences in size and gene density among chromosomes.

RDD sites are significantly enriched in genes that play a role in helicase activity (motor proteins that separate strands of DNA), and in protein and nucleotide binding.

Of the 10 210 sites, 44 percent (4 453) are in coding exons (10 percent in the last exon), 4 percent (386) are in the 5’ untranslated regions and 39 percent (3 977) are in the 3’ untranslated regions. Sites also tend to cluster: 1 059 (10 percent) are adjacent, and 26 percent are within 25 bp of each other.
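One way to quantify this kind of clustering is to ask what fraction of sites have a neighbouring site within a given distance. The Python sketch below works under that assumption; the function name and the positions are hypothetical, and the authors' exact definition of "adjacent" and "within 25 bp" may differ.

```python
def fraction_near_neighbour(positions, max_gap):
    """Fraction of sites with another site at most `max_gap` bases away.

    `positions` are coordinates on a single chromosome.
    """
    pos = sorted(positions)
    if not pos:
        return 0.0
    near = 0
    for i, p in enumerate(pos):
        close_left = i > 0 and p - pos[i - 1] <= max_gap
        close_right = i < len(pos) - 1 and pos[i + 1] - p <= max_gap
        if close_left or close_right:
            near += 1
    return near / len(pos)

# Hypothetical site positions, for illustration only.
sites = [100, 101, 130, 150, 400, 1000]
print(fraction_near_neighbour(sites, 1))   # share of sites with an adjacent site
print(fraction_near_neighbour(sites, 25))  # share of sites with a neighbour within 25 bp
```

Because the positions are sorted first, only the immediate left and right neighbours need to be checked for each site.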

In view of this surprising finding, mapping studies that have hitherto focussed on identifying DNA variants as “disease susceptibility alleles” may be missing a lot (which may explain why the exercise has been singularly unsuccessful [8]). Cheung and colleagues believe they also need to map RNA sequence variants that are not in the DNA sequences. In my view, all such mapping exercises are of limited utility simply because the genome is fluid and dynamic; hence the proper focus of post-genomics research is genomic dynamics, the study of how and why the genome changes, quite possibly in relation to what people experience, or what they choose to do.

Article first published 23/11/11


  1. Ho MW. Genetic Engineering Dream or Nightmare? The Brave New World of Bad Science and Big Business, Third World Network, Gateway Books, MacMillan, Continuum, Penang, Malaysia, Bath, UK, Dublin, Ireland, New York, USA, 1998, 1999, 2007 (reprint with extended Introduction).
  2. Ho MW. Living with the Fluid Genome, ISIS & TWN, London and Penang, 2003.
  3. Pinker S. The Blank Slate, The Modern Denial of Human Nature, Penguin Press, Science, 2002.
  4. Ho MW. Development and Evolution Revisited. In Handbook of Developmental Science, Behavior and Genetics (K Hood, C. Halpern, G Greenberg and R. Lerner, eds.), Blackwell Publishing, New York, 2009.
  5. Ho MW. Nurturing Nature. In Genetic Explanations Sense and Nonsense (S. Krimsky and J Gruber, eds.), Harvard University Press, Harvard, 2011.
  6. Ho MW. Epigenetic inheritance, what genes remember. Science in Society 41, 4-5, 2009.
  7. Li M, Wang IX, Li Y, Bruzel A, Richards AL, Toung JM and Cheung VG. Widespread RNA and DNA sequence differences in the human transcriptome. Science 2011, 333, 53-58.

Comments on this article

Kaviraj Comment left 24th November 2011 23:11:14
I have always said that the genetic material is an interactive system - input=output. This is evidenced by the instances of brain damage in vaccinated children and recorded extensively by Harris Coulter PhD in his book on the subject "Vaccination. The Assault on the American Brain." If the genes are not interfered with, the result is much healthier children who suffer 80% less acute diseases. This automatically translates to less costs to healthcare. Genetics as a deterministic system has never held any proper grounds, regardless of the claims to the contrary. Just as the germ theory is a fallacy which robs the patient of his responsibilities and makes him easier manipulable to Big Medicine's machinations, the genetic model has been adopted as explanation for diseases not associated with any germs. For me, it is clear that such fallacious theoretical models are promoted to gain more control and not with the goal of scientific discovery.

harry Comment left 24th November 2011 23:11:41
Remembering school time in biology lessons 25 years ago I asked my teacher how evolution could do so countless modifications based only on "survival of the fittest". Thanks god there are scientists who search for answers that make more sense. Thank you for this opportunity. sorry for my english

Ken Conrad Comment left 28th November 2011 10:10:09
This question is somewhat off topic however if genetics are as fluid and dynamic as you point out, what are the implications with respect to genetic blueprinting of bacteria and its use in epidemiology? Ken Conrad

Dr R K S Rathore Comment left 28th November 2011 22:10:03
Since the discovery of Mendel on the principles of inheritance major achievements in the development of new varieties in plants & animals have been made. Therefore, the role of genes on DNA cannot be ignored. Our understanding on epigenetic inheritance is adding more fuel to the fire. Many thanks to show the other side of the coin.

Mae-Wan Ho Comment left 28th November 2011 22:10:51
Good question, Ken Conrad. Geneticists have indeed found the same rapid 'evolution' due to horizontal gene transfer and recombination, if not directed mutations, for example in the recent E. coli outbreak in Germany. They are so rapid and profuse that some of us suspect artificial genetic engineering may have something to do with it.
