Science in Society Archive

Life after the Central Dogma

The biotech industry was launched on the scientific myth that organisms are hardwired in their genes, a myth thoroughly exploded by scientific findings that have accumulated since the mid-1970s, and especially since genome sequences became available (see Living with the Fluid Genome, by Mae-Wan Ho).

We bring you the latest surprises that tell you why our health and environmental policies based on genetic engineering and genomics are completely misguided; and more importantly, why the new genetics demands a thoroughly ecological approach.

How to Keep in Concert

One of the biggest puzzles of the fluid genome is how multiple copies of a gene scattered throughout the genome can be kept so nearly identical, which may be good for the organism. But the mechanism responsible has a downside: it can convert healthy genes to defective ones when cells are stressed. Dr. Mae-Wan Ho reports.

The mystery of perfect copies

‘Multigene’ families are families of genes that serve the same function and are almost identical copies of one another. Multigene families exist in the genome of all living organisms, and are present either in blocks of repeats, or in single copies dispersed throughout the genome.

One question that has preoccupied geneticists from the first is how the multiple copies of a gene sequence remain so uniform within a species. The uniformity is out of all proportion to expectations based on the rate of random mutations that strike most other parts of the species’ genome, and it is all the more striking when set against the divergence of the same gene sequence between species. The members of multigene families seem to evolve ‘in concert’ within a species.

The ribosomal RNA (rRNA) genes - required for protein synthesis in the ribosomes within the cell - are the best-studied examples of ‘concerted’ evolution in eukaryotes (organisms including human beings, whose genomes are enclosed within a nucleus); and gene conversion has been proposed as a mechanism, especially for genes that are dispersed throughout the genome. Gene conversion is a process whereby the sequence of one gene converts that of another in the genome, so the end result is a closer resemblance between them.

Large numbers of rRNA genes are present in eukaryotes, typically more than 100, and in some cases more than 1 000; and the size of the repeated unit also tends to be very big. This makes precise analysis very difficult. The repeat unit of human rRNA genes, for example, is about 43 kb; and there are five blocks of about 100 tandem repeats, each block on a different chromosome.

A closer look

Microbiologist Liao Daiqing of the University of Sherbrooke in Quebec, Canada, compared the sequences of multiple rRNA genes within the genomes of 12 bacteria that have multiple copies of the rRNA genes. The genes for the three rRNA molecules (23S, 16S and 5S) found in the ribosome are typically linked together in prokaryotes and transcribed as a single unit called an operon. The lengths of these three rRNA genes are ~2 900 bp (23S), ~1 500 bp (16S) and ~120 bp (5S), and their sizes as well as sequences are well conserved between different prokaryotic species. The multiple rRNA operons (multi-gene units under the same transcription control) are generally dispersed throughout the prokaryotic genome. Liao analysed the rRNA genes and their immediate flanking sequences in 19 completely sequenced genomes, but seven of the genomes surveyed contain only one copy of each rRNA gene.

He found striking sequence homogeneity of each individual rRNA gene family within a species, in contrast to the divergence of gene sequences between species.

Within a genome, evidence of gene conversion was found throughout the entire length of each individual rRNA gene and its immediate flanking regions. Individual conversion events, however, convert only a short sequence tract, and the conversion partner can be any gene within the gene family in the genome. He confirmed that gene sequences undergo much slower divergence than their flanking sequences, and any homogeneous flanking regions that exist may be the result of incidental co-conversion with the gene sequence.

The average divergence (difference) among the seven 16S rRNA genes present in E. coli is 0.0055 per site, whereas the average divergence between the 16S rRNA genes in E. coli and its close relative H. influenzae is 0.1325, or 24 times greater. The same applies to the 23S and 5S rRNA genes. No sequence heterogeneity was detected for multiple copies of 23S, 16S or 5S in Aquifex aeolicus, Chlamydia trachomatis, Haemophilus influenzae, Helicobacter pylori, Methanobacterium thermoautotrophicum and Synechocystis PCC6803. Five of these six species have only two rRNA operons, whereas there are six operons in H. influenzae. There are 10 and 7 rRNA operons in B. subtilis and E. coli respectively, but the rRNA genes in these two species also display remarkable sequence homogeneity.
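To make the 'divergence per site' figures concrete, here is a minimal Python sketch. It assumes the simplest possible measure, the uncorrected proportion of aligned sites that differ; the article does not say which distance measure Liao actually used. The toy sequences are invented for illustration and are not real rRNA data. Only the two figures 0.0055 and 0.1325 come from the text.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of aligned sites at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_pairwise_divergence(seqs) -> float:
    """Average p-distance over every pair of gene copies."""
    pairs = list(combinations(seqs, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Toy aligned copies, invented for illustration (not real rRNA sequences).
toy_copies = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"]
toy_within = mean_pairwise_divergence(toy_copies)

# Figures quoted in the text for 16S rRNA genes.
within_e_coli = 0.0055      # among the seven E. coli copies
between_species = 0.1325    # E. coli versus H. influenzae
fold_difference = between_species / within_e_coli

print(f"toy within-genome divergence: {toy_within:.4f} per site")
print(f"between/within ratio from the text: {fold_difference:.1f}")
```

The ratio of the two quoted figures rounds to the '24 times greater' stated above.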

Obvious sequence heterogeneity was found for the intergenic ‘spacer’ sequences between 16S and 23S genes in B. subtilis, E. coli, H. influenzae and T. pallidum. This is mainly due to the presence or absence of tRNA (transfer RNA) genes or the presence of different tRNA genes in this intergenic region. The contrast of homogeneity in the gene sequences to heterogeneity in the intergenic spacers implies that concerted evolution does not reflect gross replacement of one operon with another; rather it is a gradual, region-by-region homogenisation process.

Individual conversion tracts appear to be short, apparently less than 500bp, similar to those observed in other organisms.

How genes may convert

Several mechanisms can lead to sequence conversion. The first is reverse transcription of an rRNA sequence into complementary rDNA by reverse transcriptase (RT); the rDNA is then inserted in place of other rRNA genes in the genome. The second mechanism involves recombination between different rRNA genes during DNA replication, so they end up with the same sequence or more similar sequences. The third mechanism involves the ‘invasion’ of one gene by the single-stranded DNA of another gene to form a hybrid duplex, followed by DNA repair to remove the mismatch.

The first two mechanisms are considered unlikely in prokaryotes. Although RT-mediated gene conversion appears to occur in the eukaryote yeast, RT activity cannot be detected in many different types of cells including E. coli. Unequal reciprocal recombination can in principle account for homogenisation of tandemly repeated genes. However, that could not satisfactorily explain the remarkable heterogeneity of sequences flanking the rRNA genes. Furthermore, ectopic recombination between repetitive sequences in different parts of the genome can result in sequence deletion, inversion or translocation and such drastic genomic changes lead to genome instability.

So that leaves gene conversion via heteroduplex formation, probably mediated by the complex bacterial enzyme RecBCD that controls recombination at particular ‘Chi’ (pronounced "Kye") recombination hotspots, with the sequence GCTGGTGG (see box). The Chi element is one of the most abundant repeated sequences in the E. coli genome. Chi-like sequences are frequently found within the 16S and 23S rRNA genes and their vicinities. For example, the sequence stretch GCTGGCGG near the 5’ end of the 16S rRNA gene differs from Chi by only one nucleotide, and this change does not appear to affect its function. This Chi-like sequence is conserved in all bacterial 16S rRNA genes. Although the RecBCD/Chi system may not operate in all species, similar recombination machinery may be responsible.
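The idea of a 'Chi-like' sequence can be sketched as a scan for eight-base windows that differ from Chi at no more than one position. This is only an illustration: the function names and the toy sequence are hypothetical, the scan looks at one strand only, and it says nothing about whether a matching site is functional.

```python
CHI = "GCTGGTGG"  # the E. coli Chi octamer given in the text


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def find_chi_like(seq: str, max_mismatches: int = 1):
    """Return (position, window, mismatches) for each Chi-like window in seq."""
    seq = seq.upper()
    k = len(CHI)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        d = hamming(window, CHI)
        if d <= max_mismatches:
            hits.append((i, window, d))
    return hits


# The 16S stretch quoted in the text differs from Chi at a single position.
print(hamming("GCTGGCGG", CHI))

# Toy sequence invented for illustration: one Chi-like site, one exact Chi.
toy = "AAGCTGGCGGTTTGCTGGTGGCC"
for position, window, mismatches in find_chi_like(toy):
    print(position, window, mismatches)
```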

A universal gene converter?

The E. coli RecBCD enzyme is a multifunctional protein complex (330 kDa) containing three subunits, the products of the recB, recC, and recD genes. This enzyme displays four distinct activities: nuclease, helicase, ATPase, and site-specific recognition of the DNA regulatory sequence, ‘Chi’. RecBCD enzyme is responsible for the seemingly disparate functions of DNA degradation and repair of the bacterial chromosome. The former function is achieved by the combined action of its helicase and nuclease activities, whereas a recombinationally activated form accomplishes the latter, after RecBCD interacts with Chi. RecBCD is a principal component of the main pathway for homologous genetic recombination in E. coli. Structural or functional analogues of the RecBCD enzyme are present in many bacteria.

The nuclease activity protects the cell from invasion by viral DNA, although bacteriophages (bacterial viruses) that infect E. coli have developed strategies to overcome it, producing proteins that either bind to RecBCD to inhibit its activity or cap the ends of the phage genome to prevent RecBCD from entering it.

‘Chi’ is a DNA sequence of eight nucleotides (5’-GCTGGTGG-3’) that stimulates the frequency of recombination in its vicinity by 5 to 10 fold over background levels. It was originally discovered as a mutation in λ phage that protected its genome from degradation by RecBCD enzyme. The effect of Chi is highly oriented, with the region of enhanced recombination extending downstream of the 5’ end of the Chi sequence, decreasing by a factor of two for every 2.2 to 3.2 kilobases, returning to background levels 10 kilobases downstream. All recombination stimulated by this site requires the activity of the RecBCD enzyme, and only if the enzyme approaches Chi from the 3’ side.
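The figures in this paragraph can be checked against each other with a rough toy model. It rests on an assumption that is not in the source: that the excess stimulation over background halves at a constant rate with distance from Chi. The function name and parameters are hypothetical.

```python
def chi_fold_stimulation(distance_kb: float, peak_fold: float, halving_kb: float) -> float:
    """Fold stimulation over background at a distance from Chi (toy model).

    Assumes the excess over background (peak_fold - 1) halves every
    halving_kb kilobases; this functional form is an illustrative guess.
    """
    if distance_kb < 0:
        raise ValueError("distance must be non-negative")
    excess_at_chi = peak_fold - 1.0
    return 1.0 + excess_at_chi * 0.5 ** (distance_kb / halving_kb)


# Extremes of the ranges quoted in the text, evaluated 10 kb from Chi.
low = chi_fold_stimulation(10.0, peak_fold=5.0, halving_kb=2.2)
high = chi_fold_stimulation(10.0, peak_fold=10.0, halving_kb=3.2)
print(f"stimulation 10 kb from Chi: {low:.2f}-fold to {high:.2f}-fold")
```

Under this assumption, the stimulation 10 kilobases away is roughly 1.2 to 2 times background, broadly in line with the statement that the effect has faded by that distance.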

In one model, RecBCD enzyme enters a dsDNA end and unwinds the duplex, while preferentially degrading the strand corresponding to the 3’-terminus at the point of entry. Single-strand binding (SSB) protein binds the single-stranded DNA (ssDNA) produced. When RecBCD encounters a Chi sequence, however, the 3’ to 5’ nuclease activity is attenuated and a weaker 5’ to 3’ nuclease activity is activated on the opposite strand. Following the interaction with Chi, degradation of the strand corresponding to the 3’ end is attenuated at least 500-fold. This attenuation of nuclease activity persists until the enzyme dissociates from the DNA, explaining the elevated recombination frequency downstream of Chi sites. RecBCD enzyme facilitates the loading of RecA protein onto the ssDNA produced by the continued translocation and unwinding of the DNA molecule beyond the Chi site. The RecA protein-coated ssDNA filament then invades a homologous DNA molecule, and converts it by a DNA repair mechanism that removes the mismatch in the invaded copy.

This mechanism is believed to be responsible for 80% of recombination events following conjugation. Any dsDNA break is similarly repaired by RecBCD-mediated recombination. Repair is facilitated by the abundance of Chi sites, which occur approximately once every 4.6 kilobases in the E. coli genome, with 75.5% of the sites oriented toward the origin of replication.

Gene conversion in health & disease

Evidence for gene conversion via heteroduplex formation has emerged in other bacteria and in yeast. Analysis of the RNU2 gene in various human populations reveals that repeats within an individual tandemly repeated array are more homogeneous than repeats in different arrays, while the intergenic flanking regions are not homogeneous, suggesting that gene conversion, rather than unequal crossing-over, is involved.

Chi-like sequences have been found in many eukaryotic genomes and are suspected to be involved in gene conversion events, for example within the MHC (Major Histocompatibility Complex). The MHC is a complex of around 100 genes in vertebrates that includes the genes for the extremely polymorphic (variable) cell surface proteins called HLA in humans and H-2 in mice, which provide immunological markers for ‘self’ and are involved in the immune response against ‘nonself’, including transplants.

Alec Jeffreys and Celia May at Leicester University examined human sperm for evidence of gene conversion. The formation of germ cells (egg and sperm) during meiosis is the usual point in the life cycle of higher organisms when chromosomes pair up, ‘cross over’ and exchange parts, thereby shuffling the genes they inherit from each parent. But it appears that instead of an equal exchange of parts at the crossover points, there is often an unequal conversion of one allele by the other.

Jeffreys and May first concentrated on a recombination hotspot, DNA3, located in the MHC, which is surrounded by single nucleotide polymorphisms (SNPs), with many men heterozygous for multiple SNPs. They found evidence of gene conversion, at a rate of 1.3 to 3.4 x 10^-3 per sperm, that was two to three times higher than the rate of crossover. All conversions involved the transfer of short stretches of DNA (300 bp to 1091 bp). Conversion rates declined rapidly with distance and defined a very steep gradient extending in each direction from the centre of the hotspot.

Another crossover hotspot, DMB2, in the MHC was much less active than DNA3, but the pattern of gene conversion was very similar. A third crossover hotspot is the gene SHOX in the pseudo-autosomal pairing region PAR1 on the sex chromosomes. The crossover rate there is much higher (3.7 x 10^-3 per sperm), although the density of SNPs is low. Again, there is evidence of gene conversion involving short tracts of DNA.

The mean length of conversion tracts probably lies in the range of 55-290bp. They estimate that somewhere between 80% and 94% of recombinations at hotspot DNA3 are gene conversions rather than reciprocal cross-overs.

Similar results were observed earlier in mice. The number of crossovers during meiosis is tightly regulated to one to two per pair of chromosomes in mice, and their distribution is not random: there are recombination hot and cold regions. Researchers at the Institute of Human Genetics, Montpellier, France, found a high frequency of gene conversion in the region of highest crossover density. They found 16 gene conversion events among 6 000 molecules of sperm DNA, corresponding to a frequency of 2.7 x 10^-3. Most of the gene conversion events involved tracts of less than 540 bp.
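The mouse frequency follows directly from the counts given, and can be set beside the human range reported by Jeffreys and May. A short Python check, using only numbers quoted in the text (the variable names are our own):

```python
# Counts reported for mouse sperm DNA in the text.
conversion_events = 16
molecules_screened = 6000

mouse_frequency = conversion_events / molecules_screened
print(f"mouse gene conversion frequency: {mouse_frequency:.1e} per molecule")

# Human range at hotspot DNA3 quoted earlier in the text (per sperm).
human_low, human_high = 1.3e-3, 3.4e-3
in_human_range = human_low <= mouse_frequency <= human_high
print(f"within the human DNA3 range: {in_human_range}")
```

On these figures the mouse frequency falls inside the range reported for the human DNA3 hotspot, though the two studies examined different loci and the comparison is only indicative.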

Gene conversion is increasingly implicated in human disease, in which the disease-causing mutations appear to be copied from a closely related pseudogene (a mutated gene that is no longer functional) in the genome. Cases attributed to Chi sequences include the T4 cationic trypsinogen gene associated with pancreatitis; the β-crystallin gene CRYBB2 in a dominant form of cataracts; the CYP21B gene responsible for steroid 21-hydroxylase deficiency and congenital adrenal hyperplasia; and Von Willebrand disease (VWD), the commonest inherited bleeding disorder. Such pathological gene conversions may be linked to stress, and resemble the controversial phenomenon of ‘directed’ mutations found in stressed and starving bacterial cells (see "To mutate or not to mutate", this series).

Avoiding stress may be much more important for health than inheriting good genes.

Article first published 20/09/04


  1. Liao D. Gene conversion drives within genic sequences: concerted evolution of ribosomal RNA genes in Bacteria and Archaea. J Mol Evol 2000, 51, 305-17.
  2. Arnold DA and Kowalczykowski SC. RecBCD helicase/nuclease. Encyclopaedia of Life Sciences, Macmillan, 1998.
  3. Martinsohn J Th, Sousa AB, Guethlein LA and Howard JC. The gene conversion hypothesis of MHC evolution: a review. Immunogenetics 1999, 50, 168-200.
  4. Dorak MT. Common terms in evolutionary biology and genetics.
  5. Jeffreys AJ and May CA. Intense and highly localized gene conversion activity in human meiotic crossover hot spots. Nature Genetics 2004, 36, 151-6.
  6. Guillon H and de Massy B. An initiation site for meiotic crossing-over and gene conversion in the mouse. Nature Genetics 2002, 32, 296-9.
  7. Küppers R and Dalla-Favera R. Mechanisms of chromosomal translocation in B cell lymphomas. Oncogene 2001, 20, 5580-94.
  8. Chen J-M, Raguenes O, Ferec C, Deprez PH and Verellen-Dumoulin C. A CGC>CAT gene conversion-like event resulting in the R122H mutation in the cationic trypsinogen gene and its implication in the genotyping of pancreatitis. J Med Genet 2000, 37 (
  9. Virinder Sarhadi V, Reis A, Jung M, Singh D, Sperling K, Singh JR and Bürger J. A unique form of autosomal dominant cataract explained by gene conversion between β-crystallin B2 and its pseudogene. J Med Genet 2001, 38, 392-6.
  10. Amor M, Parker KL, Globerman H, New MI and White PC. Mutation in the CYP21B gene (Ile-172-Asn) causes steroid 21-hydroxylase deficiency. Proc Natl Acad Sci USA 1988, 85, 1600-4.
  11. Surdhar GK, Enayat MS, Lawson S, Williams MD and Hill FGH. Homozygous gene conversion in von Willebrand factor gene as a cause of type 3 von Willebrand disease and predisposition to inhibitor development. Blood 2001, 98, 248-50.
