ISIS Report - March 21 2001
E. coli O157:H7 and Genetic Engineering
The food-borne pathogen E. coli O157:H7 has been sequenced. Dr. Mae-Wan Ho asks whether genetic engineering might have contributed
towards its emergence.
E. coli O157:H7 is a food-borne pathogenic strain of bacteria that
emerged in the United States in the 1980s, and is now responsible for some
75,000 cases of infection annually in that country. It has also been
responsible for major outbreaks in Scotland, Japan and elsewhere since.
The first outbreak was associated with infected hamburgers in 1982. The
strain responsible, EDL933, isolated from ground beef in Michigan, has
been studied as a reference strain. The complete sequence of its genome
has recently been determined (1,2), and its closest relative turns out to
be the laboratory strain K-12 MG1655. E. coli O157 has acquired
Shiga toxin genes (from the bacterium Shigella) and plasmids
containing virulence factors by horizontal gene transfer.
The two strains, O157 and K-12, share a common backbone with almost
identical gene order. The 4.1 million base pairs in the genomes can be
lined up side by side along their lengths except at one point where the
O157 genome is reversed. Inversions around the starting point of
replication are common in bacterial genome evolution.
Scattered roughly evenly within each genome are hundreds of sections of
DNA that are unique to one or the other: 1.34 megabases coding for 1,387
genes in the O strain, the O islands; and 0.53 megabases coding for 528
genes in the K strain, the K islands. Much of the DNA in O and K islands
has been acquired by horizontal gene transfer.
There are 106 O and K islands present at the same locations in the
backbone. Only a subset of islands is associated with elements likely to
be autonomously mobile. Most islands are horizontal transfers of relatively
recent origin from a donor species with a different intrinsic base
composition.
Of the 1,387 acquired genes in O157, 40% (561) can be assigned a
function; another 338 genes of unknown function lie within clusters that
are probably remnants of phage (bacterial virus) genomes. About 33%
(59/177) of the O islands contain only genes of unknown function. Many
classified proteins are related to proteins from other E. coli
strains or related enterobacteria known to be associated with virulence,
and include alternative metabolic capacities, prophages (integrated
genomes of bacterial viruses) and other new functions.
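The percentages quoted above can be checked directly from the counts given in the text. The short Python sketch below is only an arithmetic cross-check; the variable and function names are illustrative and do not come from the genome paper.

```python
# Counts quoted in the text for the O157 strain.
acquired_genes = 1387          # genes carried on O islands
genes_with_function = 561      # acquired genes that can be assigned a function
o_islands_total = 177          # total number of O islands
o_islands_unknown_only = 59    # O islands containing only genes of unknown function


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


function_pct = percent(genes_with_function, acquired_genes)
unknown_island_pct = percent(o_islands_unknown_only, o_islands_total)

print(f"Acquired genes with an assigned function: {function_pct:.1f}%")
print(f"O islands with only unknown-function genes: {unknown_island_pct:.1f}%")
```

Both values round to the figures given in the text (40% and 33%).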
There are 3,574 protein-coding regions in the backbone, and the average
nucleotide identity between O157 and K-12 is high: 98.5%. Of these regions,
89% are of equal length and 25% encode identical proteins. Some
chromosomal regions are more different (hypervariable) than the average,
but they encode a comparable set of proteins at the same relative
chromosomal positions. In the most extreme case (YadC), the proteins from
the two strains exhibit only 34% identity. Four such loci encode known or
putative biosynthesis operons of fimbrial proteins used in attachment to
host cells. Another codes for a restriction/modification system that
breaks down foreign DNA.
From the extent of genetic differences between the strains, the authors
estimate that E. coli O157:H7 and K-12 shared a common ancestor
about 4.5 million years ago. This estimate is highly questionable,
however, as are all similar estimates.
Such estimates are based on the so-called molecular clock hypothesis,
which assumes a steady, neutral (nonadaptive) random accumulation of
genetic difference per unit time. One assumed rate is one percent per
million years. But this is notoriously unreliable, as we now know that
mutational changes vary directly in proportion to the number of DNA
replication cycles. So, organisms with short life-cycles accumulate
changes faster than those with long life-cycles. There are also many fluid
genome processes that can rapidly change genomes. These include
hypermutation (mutation rates up to a million times faster than
usual), recombination and horizontal gene transfer. Horizontal gene
transfer is well documented in all bacteria including E. coli, as
is clear from the genome sequence data. Recombination, too, appears to be
an important mechanism in the evolution of the enterobacteria to which
E. coli belongs. And hypermutation has been identified in several
regions in the E. coli chromosome.
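To see how sensitive a molecular clock estimate is to the assumed rate, consider a minimal Python sketch of a strict clock, in which divergence time is simply the observed difference divided by the assumed rate. This is not the calculation used by the authors of the genome paper, whose method is not described here. The input difference (100% minus the 98.5% backbone identity) and the alternative rates are illustrative assumptions, and the function name is hypothetical.

```python
def divergence_time_myr(percent_difference, percent_per_myr):
    """Strict molecular clock: time (million years) = difference / rate.

    percent_difference: sequence difference between two genomes, in percent.
    percent_per_myr: assumed rate of divergence, in percent per million years.
    """
    if percent_per_myr <= 0:
        raise ValueError("rate must be positive")
    return percent_difference / percent_per_myr


# Illustrative input only: difference implied by 98.5% backbone identity.
difference = 100.0 - 98.5

# The estimate is inversely proportional to the assumed rate, so any
# process that speeds up change shortens the inferred time.
for rate in (0.5, 1.0, 10.0, 1000.0):
    years = divergence_time_myr(difference, rate) * 1_000_000
    print(f"assumed rate {rate:>7} %/Myr -> about {years:,.0f} years")
```

Under these assumptions, a tenfold error in the rate produces a tenfold error in the estimated divergence time, which is why rapid processes such as hypermutation or horizontal gene transfer can make two strains appear to have diverged far longer ago than they actually did.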
Another factor that would give an overestimate of divergence time is
artificial genetic engineering. Artificial genetic engineering involves
rampant recombination and transfer of genes across divergent species
barriers. Now that sequence data are becoming widely available, one ought
to be asking the serious question as to whether genetic engineering might
have contributed towards the emergence of E. coli O157 some twenty
years ago (3).
1. Perna NT et al. Genome sequence of enterohaemorrhagic Escherichia
coli O157:H7. Nature 2001; 409: 529-33.
2. Eisen JA. Gastrogenomics. Nature 2001; 409: 462-3.
3. See Ho MW, Traavik T, Olsvik R, Tappeser B, Howard V, von Weizsacker
C and McGavin G. Gene Technology and Gene Ecology of Infectious
Diseases. Microbial Ecology in Health and Disease 1998; 10:
33-59; also "Genetic engineering superviruses" by Mae-Wan Ho, ISIS
Report March 2001.