Origins 21(2):91-108 (1994).
WHAT THIS ARTICLE IS ABOUT
Pseudogenes are DNA sequences that resemble functional genes but seem to have no purpose. The presence of similar eta globin pseudogenes in humans and chimps has been used as an argument for common ancestry of the two species. The argument has two parts: that the pseudogene sequences actually have no function, and that God would not create similar non-functional sequences in humans and chimps. The latter argument is theological and resembles many other theological arguments that have been proposed and later abandoned. Theological arguments should not be relied on unless well supported by Scripture.
The argument that the eta globin pseudogene has no function is consistent with most of the data, although lack of function has not been demonstrated. Possible function is suggested by the location of the pseudogene and differences in the extent of divergence of "intronic" and "exonic" sequences. The possibility that the eta globin pseudogene provides a binding site for a molecule involved in gene regulation has not been ruled out. At present, the evidence from pseudogenes fits reasonably well into an evolutionary interpretation, for those who choose to make that interpretation. However, there is much about the operation of the genome in general, and pseudogene sequences in particular, that is not well understood. Rapid progress is being made in understanding how the genome operates, and it is reasonable to expect that greater understanding of the meaning of pseudogenes will be forthcoming.
INTRODUCTION
Theists and naturalists have long argued over whether
nature provides evidence of design. Many theists have claimed that nature is so
designed that one can infer the existence of a designer. Some theists have made
the stronger claim that nature reveals a designer who is the Creator God
revealed in the Bible. Many examples of apparent design have been described,
ranging from the non-random properties of the universe to the intricate
mechanism of a living cell. To the theist, these features speak clearly of the
existence of an intelligent Creator who created with a purpose in
mind.
Naturalists have responded with arguments of their
own. One argument that is currently popular is the claim that many features in
nature are not designed very well. It is argued that such poor design
indicates either an inferior designer or the absence of a designer. Several
examples of allegedly poor design have been proposed (e.g., Miller 1994). One of
the most difficult examples for creationists to explain is probably the
existence of certain DNA sequences known as pseudogenes. This paper will explore
some of the characteristics of pseudogenes and their relationship to the
argument for or against design.
WHAT ARE PSEUDOGENES?
Ordinary structural genes are made of DNA sequences that
contain coded information for making a particular protein molecule. The
information includes a start signal, coding for the sequence of amino acids
needed to make up the protein, and a stop signal. Additional signals that
regulate the timing of gene activity are found adjacent to the gene, and often
also at some distance from it. The amino-acid coding sequence is often broken up
into portions known as "exons," which are separated by spacer sequences called
"introns." Pseudogenes are DNA sequences that appear similar to functional
genes, but contain important defects that appear to make them incapable of
producing a functioning protein (Proudfoot 1980). Defects of pseudogenes may
include lack of a start codon, presence of extra stop signals, and abnormal or
absent flanking regulatory elements. It is thought that mutations in pseudogenes
are neutral, and hence free from selection. The first report of a pseudogene was
in 1977 (Jacq, Miller and Brownlee 1977). Since that time, a large number of
pseudogenes have been described in humans and a wide variety of other species.
The supposed defects of pseudogenes have been used as an argument that nature is
too poorly designed to attribute its existence to special creation by a
supernatural Designer (Miller 1994).
Two types of
pseudogenes are known: unprocessed pseudogenes and processed pseudogenes.
Processed pseudogenes are so named because they appear to be altered (processed)
copies of active genes. Processed pseudogenes are found on different chromosomes from
their functional counterparts. They lack introns (spacer sequences within a
gene) and certain "upstream" (located in front of the gene) regulatory sequences;
they often terminate in a series of adenines; and they are flanked by direct
repeats. ("Direct repeats" are associated with movable genetic elements, which
may in some cases play a role in inserting a pseudogene into a chromosome.)
Processed pseudogenes may be complete copies of the coding sequence, or may be
incomplete copies, or may have additional inserted sequences. They were first
found only in mammals (Vanin 1985), but have since been found in other groups
(Currie and Sullivan 1994). Processed pseudogenes are believed to have arisen in
a three-step process. The first step is copying of the DNA message into an RNA
transcript. The introns are then edited out of this transcript to produce a
messenger RNA (mRNA) molecule. Finally, the mRNA is copied back into a
chromosome in a process called reverse transcription (see Vanin 1985 for review;
see Tchenio, Segal-Bendirdjian and Heidmann 1993 for an example). The L1 family
of repetitive DNA sequences appears to be the result of this process (Jurka
1989).
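The three-step process described above can be illustrated with a toy model. This is only a sketch: the gene sequence, the exon coordinates and the poly-A tail length are invented for illustration, and the function names are hypothetical rather than taken from any library.

```python
# Toy model of the three-step origin proposed for processed pseudogenes.
# The sequence, exon coordinates and tail length are made up for illustration.

def transcribe(dna_coding_strand):
    """Step 1: copy the DNA message into a primary RNA transcript (T -> U)."""
    return dna_coding_strand.replace("T", "U")

def splice(transcript, exons):
    """Step 2: edit out introns by joining exon intervals (0-based, end-exclusive)."""
    return "".join(transcript[start:end] for start, end in exons)

def reverse_transcribe(mrna):
    """Step 3: copy the mRNA back into DNA (shown as the coding strand)."""
    return mrna.replace("U", "T")

# Hypothetical gene: exon, intron, exon, intron, exon.
gene = "ATGGCC" + "GTAAGTTTTCAG" + "GAATTC" + "GTGAGTCCCTAG" + "TGGTAA"
exons = [(0, 6), (18, 24), (36, 42)]

pre_mrna = transcribe(gene)
mrna = splice(pre_mrna, exons) + "A" * 10  # add a poly-A tail
processed_copy = reverse_transcribe(mrna)

print(processed_copy)
```

The resulting copy has no introns and ends in a run of adenines, which are two of the features listed above for processed pseudogenes.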
Unprocessed pseudogenes are usually found within
clusters of similar, functional sequences on the same chromosome (Harris et al.
1984). They typically have "introns" and flanking regulatory sequences
resembling a functional gene. As with processed pseudogenes, expression of an
unprocessed pseudogene is generally prevented by stop codons. Numerous other
differences interpreted as deletions, insertions and point mutations may also be
present. A truncated mRNA transcript may or may not be produced. Unprocessed
pseudogenes are found in a wide variety of organisms. They are believed to have
arisen by gene duplication, which produced an extra copy of the gene. The extra
copy, not being needed, could accumulate mutations without harming the organism.
Examples of unprocessed pseudogenes are present in the alpha-globin and
beta-globin gene families (e.g., see Hardison and Miller 1993 and references
therein).
THE ARGUMENT FROM SHARED MISTAKES
When genes for equivalent proteins are compared in
different species, they are often found to differ in sequence. In general, the
more similar two species are taxonomically, the more similar are their DNA
sequences, both overall and for specific enzymes. Exceptions do occur, but
the overall pattern is easily recognized. Two explanations have been proposed
for the observed pattern of similarities in molecular
sequences.
One explanation for sequence similarities is
that they are inherited from an evolutionary ancestor. Sequence differences are
attributed to accumulation of mutations since the species diverged from their
common ancestor. A second, contrasting, explanation is that sequence
similarities are due to common design for a similar function. Sequence
differences may reflect functional differences, such as might be required for
protein function in different metabolic environments, or regulatory functions in
different genetic backgrounds. Sequence differences may also be due to the
degeneration of the genome that seems to have taken place since the
creation.
Similarities in functional sequences for the
same protein in different organisms are to be expected, since they perform
similar functions; however, what about similarities in sequences, such as
pseudogenes, that seem to have no function? It has been argued (Max 1987,
Gilbert 1993, Miller 1994) that similar pseudogene sequences shared by two or
more species are best explained as the result of common ancestry, assuming that
an intelligent designer would not repeatedly make mistakes in creating genes.
This can be called the "argument from shared mistakes."
Comparison of DNA sequences from humans, chimpanzees and other mammals reveals a
considerable number of shared pseudogenes that are similar in sequence as well
as in positional relationship to other genes. Humans and chimps have many
similarities; this is interpreted as indicating a recent common ancestry for
humans and chimps (Gilbert 1993). The best known example of a shared pseudogene
is the eta globin (psi beta globin) gene, a member of the beta globin gene
family.
THE BETA GLOBIN FAMILY AND THE (ETA GLOBIN) PSEUDOGENE IN HUMANS
Human hemoglobin molecules are made of two sets of
proteins, produced by the alpha globin genes and the beta globin genes. Both
beta globin and alpha globin genes occur in "families" of non-identical copies.
The beta globin gene family is located on the short arm of human chromosome 11
(11p15.5), near the gene for insulin (Lalley et al. 1989). A family of alpha
globin genes is also present in mammals, but it is located on a different
chromosome (16p13).
The beta globin gene cluster consists
of five somewhat-similar functional genes and one pseudogene. The five
functional genes are arranged on the chromosome in a sequence that corresponds
to the sequence of timing of their respective functions during growth and
development. The first gene in the series is the "epsilon globin" gene, which
helps form hemoglobin molecules early in embryonic development. The second and
third genes are called "gamma-G" and "gamma-A." They help form hemoglobin
molecules later during fetal development. The "eta globin" pseudogene is next in
sequence, followed by the "delta" globin gene, which is expressed at a low level in
adults. The last gene in the series is the "beta" globin gene, which produces
most of the adult beta globin, and gives the gene family its name. As the adult
globin genes become functional, the fetal genes are turned off. The fact that
the sequence of the genes on the chromosome matches the sequence of their
activity in the developing organism seems unlikely to be the result of chance,
and can easily be interpreted as the result of intelligent
design.
The eta globin sequence has several
characteristics of pseudogenes. It resembles the other members of the beta
globin gene family, but is most similar to the gamma-A globin gene. However, it
has some important differences. Compared with the gamma-A globin gene, the eta
globin pseudogene lacks a start codon (AUG) in the appropriate position. It also
has numerous extra stop codons which would be expected to prevent production of
any protein. No mRNA transcript or protein product has been identified, and it
appears that none is produced. No medical defect is known that is traceable to
the loss of this pseudogene. In short, the eta globin sequence is not associated
with any known function or defect, and appears to be incapable of producing a
useful molecule.
The beta globin gene family is also found
in other mammals. Sequences of the human gamma-A globin gene and eta globin
pseudogenes from humans and several other species have been compared (Chang and
Slightom 1984). The human gamma-A globin gene contains three exons (portions of
the DNA that code for amino acids) of 92, 223 and 129 nucleotides, respectively,
for a total of 444 nucleotides. The corresponding "exons" of the human eta
globin pseudogene differ from the gamma-A globin gene exon sequences in 29, 38
and 43 nucleotide positions, respectively, for an overall difference of 24.8%.
The gamma-A globin gene has two introns of 122 and 877 bases, respectively.
These differ from the "intron" sequences of the eta globin pseudogene by 46-79%
and 72-94%, respectively (my figures differ somewhat from those of Goodman et
al. 1984, probably due to problems in aligning the sequences). The
gamma-A-globin exons and pseudogene "exons" are more similar to each other than
expected from random sequences, while the "intronic" sequences are so different
that no relationship among them can be inferred.
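The exon arithmetic in this paragraph can be checked directly. The sketch below uses only the exon lengths and difference counts quoted above; the per-exon percentages are derived here and are not stated in the text.

```python
# Exon lengths of the human gamma-A globin gene and the number of positions
# at which the eta globin pseudogene "exons" differ, as quoted in the text.
exon_lengths = [92, 223, 129]
exon_differences = [29, 38, 43]

total_length = sum(exon_lengths)
total_differences = sum(exon_differences)
overall_percent = 100 * total_differences / total_length

# Per-exon percentages (derived; not given in the text).
per_exon_percent = [
    round(100 * diff / length, 1)
    for diff, length in zip(exon_differences, exon_lengths)
]

print(total_length, total_differences, round(overall_percent, 1))
print(per_exon_percent)
```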
COMPARISONS OF ETA GLOBIN PSEUDOGENES IN HUMANS AND OTHER PRIMATES
The arrangement of the beta globin gene family in other
primates is very similar to that in humans (Harris et al. 1984). Humans,
chimpanzees and gorillas have the same number of beta globin genes arranged in
the same sequence. In chimpanzees, the beta globin group is on chromosome 9,
which is equivalent to human chromosome 11 (Lalley et al. 1989). Baboons have a
similar arrangement, but the delta globin gene appears non-functional, and is
classified as a pseudogene. The New World owl monkey has only one gamma globin
gene, with a possible partial second gene (Meireles et al. 1995), but the
arrangement of genes is otherwise the same as in humans. This is true also for
the galago ("bush baby"; Hardison and Miller 1993). Among non-primates, the
rabbit has only one gamma globin gene and lacks the eta globin pseudogene,
while the delta globin gene appears to be a pseudogene.
The DNA sequences of the eta globin pseudogene exons in humans, chimpanzees and
gorillas are similar. The chimpanzee eta globin pseudogene exonic DNA differs
from the human eta globin pseudogene at six nucleotide positions and from the
corresponding gorilla pseudogene at seven positions. The gorilla pseudogene
exonic DNA has three differences from humans and seven from chimpanzees. This
means that chimpanzee and gorilla eta globin exon sequences are both slightly
more similar to the human pseudogene than to each other.
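The comparison can be laid out as a small table of pairwise differences. The sketch below uses only the three counts quoted above; the helper names are hypothetical, and the lookup simply reports which other species has the fewest differences.

```python
# Pairwise differences in eta globin pseudogene "exon" sequences,
# using the counts quoted in the text.
pairwise_differences = {
    frozenset({"human", "chimpanzee"}): 6,
    frozenset({"human", "gorilla"}): 3,
    frozenset({"chimpanzee", "gorilla"}): 7,
}
species = ["human", "chimpanzee", "gorilla"]

def differences(a, b):
    """Number of differing positions between two species."""
    return pairwise_differences[frozenset({a, b})]

def most_similar(name):
    """Return the other species with the fewest differences from `name`."""
    others = [s for s in species if s != name]
    return min(others, key=lambda other: differences(name, other))

for name in species:
    print(name, "->", most_similar(name))
```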
It is clear that the "exon" portions of the eta globin pseudogenes in humans,
chimps and gorillas are highly similar. None of the differences involves any of
the eight stop codons in the pseudogenes. Several potential initiation codons
(AUG) are present, and one of the differences in the chimpanzee produces an
additional potential initiation codon in the second exon. However, none of these
is sufficient to support protein coding function.
GENE DUPLICATION HYPOTHESIS
If evolution is to occur, new genes must somehow be
produced. The most popular explanation for the evolution of new genes is that
they are modified from extra copies of existing genes. This explanation is known
as the gene duplication hypothesis (Ohno 1970). According to this hypothesis,
functional genes may be duplicated accidentally. The duplicate gene is not
needed by the organism. Both copies of the gene may be subject to selection
until one of them suffers a disabling mutation, such as a premature stop signal.
This disables the gene so it no longer has any function, and is no longer
subject to natural selection. It has become a pseudogene, and all subsequent
mutations are neutral. Over time, mutations accumulate in the pseudogene.
Eventually, according to the theory, random mutations may produce a new gene
with a new function (e.g., see Long and Langley 1993).
There is ample
evidence that gene duplication does occur (e.g., see references in Lazcano and
Miller 1994). An apparent example of parallel gene duplication in flies has been
described (see Menotti et al. 1991). However, whether new information can be
generated by random mutations in duplicated genes is another
question.
The gene duplication hypothesis holds that
mutations in duplicated genes have served as the source of additional genetic
information in complex organisms. Although widely accepted, this hypothesis is
not without some theoretical and empirical difficulties. Assuming the original
gene had been optimized by selection, mutations in the coding region of the
duplicated gene prior to a disabling mutation would likely result in production
of inferior protein molecules. Individuals with one gene that produced inferior
protein products would likely be selected against. Spread of a duplicated gene
should be difficult under these conditions. This problem could be reduced if
mutations destroyed the function of the extra gene copy early in its history.
However, there are only three stop codons, while there are 61 codons for amino
acids. One would expect mutations resulting in destruction of function to be
much less common than those resulting in production of variant proteins, most of
which could be expected to be inferior. Selection may also oppose maintenance of
a pseudogene, since it may retain enough activity to disrupt normal cellular
activities. Some pseudogenes are suspected to be involved in causing certain
diseases (e.g., Wedell and Luthman 1993, Brakenhoff et al. 1994), which should
result in negative selection against them. Thus, establishment and maintenance
of a pseudogene by gene duplication may require a rather special sequence of
events.
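The comparison of three stop codons with 61 amino-acid codons can be made more concrete by counting single-base substitutions. The sketch below assumes the standard genetic code and treats every substitution as equally likely, which is a simplification; it also ignores insertions, deletions and mutations outside the coding sequence, all of which can disable a gene as well.

```python
from itertools import product

BASES = "TCAG"
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code

all_codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
sense_codons = [codon for codon in all_codons if codon not in STOP_CODONS]

def single_base_substitutions(codon):
    """Yield every codon that differs from `codon` at exactly one position."""
    for position in range(3):
        for base in BASES:
            if base != codon[position]:
                yield codon[:position] + base + codon[position + 1:]

total_substitutions = 0
stop_creating = 0
for codon in sense_codons:
    for mutant in single_base_substitutions(codon):
        total_substitutions += 1
        if mutant in STOP_CODONS:
            stop_creating += 1

fraction = stop_creating / total_substitutions
print(len(sense_codons), total_substitutions, stop_creating, round(100 * fraction, 1))
```

Under these assumptions only a small fraction of single-base substitutions in amino-acid codons produce a stop codon, which is consistent with the expectation stated above.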
Another problem for the gene duplication
hypothesis is that the existence of duplicate copies of a gene does not
necessarily permit one of the copies to diverge from the others. For example,
seven copies of the "Enhancer of split" gene are present in Drosophila,
but it appears that none of them is free to mutate (Maier et al. 1993). The
"duplicated copies" are not extra, but all seem to be required. Many genes occur
in multiple copies that remain similar to each other rather than diverging. This
has been explained as due to a process known as gene conversion, in which one
DNA sequence is "converted" during copying to match another sequence. This may
result in maintenance of similarity among several copies of a sequence. The
situation in which multiple copies of a sequence maintain close sequence
similarity is known as "concerted evolution" (e.g., Moore et al. 1993).
Concerted evolution would tend to prevent divergence of duplicated genes, thus
presenting a problem for the gene duplication hypothesis. Tetraploid species
have far fewer pseudogenes than would be expected (Larhammar and Risinger 1994),
which seems counter to expectations of the gene duplication hypothesis.
BETA GLOBIN GENES AND THE GENE DUPLICATION HYPOTHESIS
It is thought that the eta globin pseudogene originated by
duplication of a gamma-A globin gene, because of the similarity in their
sequences. Both genes are present in all primates studied. Other mammals may
have one or the other of the two genes. For example, gamma globin, but not eta
globin, genes are present in rabbits; goats have eta globin but not gamma globin
genes (Hardison and Miller 1993); the opossum has neither (Goodman et al.
1987).
It would be useful to review the evolutionary
explanation for the distribution of eta globin genes in mammals. The proposed
explanation is that the common ancestor of marsupials and placental mammals
lacked both genes. After the evolutionary divergence of the marsupials, the
gamma globin gene formed by duplication of an existing gene in the beta globin
family. Later, but before radiation of the orders of placental mammals, the eta
globin gene formed from a duplicated gamma globin gene. This second supposed
gene duplication is estimated to have occurred about 140 million years ago (Harris
et al. 1984). Gamma and eta genes must both have been present in ancestral
placentals, but presumably gamma was lost by goats and eta was lost by
rabbits.
According to this scenario, the eta gene must
have been functional at first, because it is functional in goats. It is
non-functional in all primates, which is interpreted to mean it was already
non-functional in the ancestral primates. According to Martin (1993), primates
probably originated in the Late Cretaceous, perhaps 70 to 80 million years ago.
This interpretation implies that the eta globin pseudogene has been maintained
for more than 70 million years without being converted into a useful new gene
and without being eliminated. The persistence of a non-functional DNA sequence
in an entire lineage for such a supposed long period of time seems remarkable in
the context of the gene duplication hypothesis.
The gamma
globin gene is believed to have duplicated a second time, producing the gamma-A
and gamma-G genes. Humans, apes, Old World monkeys, and some New World monkeys
have two functional gamma globin genes. Other mammals, including galagos,
tarsiers and rabbits, have only a single gamma globin gene (Hayasaka et al.
1993, Hardison and Miller 1993). To explain this, the gamma globin gene is
postulated to have undergone a second duplication after divergence of simians
and tarsiers. Current interpretation of the fossil record of primates (Martin
1993) suggests that simians and tarsiers diverged during the Paleocene, perhaps
60 million years ago. It seems remarkable that both copies of a duplicated gene
could remain functional for 60 million years if evolution has depended on gene
duplication for the source of new genetic information.
THEOLOGICAL PRESUPPOSITION IN THE ARGUMENT FROM SHARED MISTAKES
Several factors need to be considered in interpreting DNA
sequence similarities in the eta globin pseudogenes. The argument has been
presented that eta globin pseudogene similarities are compelling evidence of
shared ancestry. This argument rests almost entirely on two assumptions: (a)
that the eta globin pseudogenes have no function, and (b) that God would not
create similar non-functioning sequences in separate species. Thus these
assumptions must be carefully examined.
The argument that
God would not act in a certain way is a theological argument, and can hardly be
addressed by science. The validity of such an argument depends on the kind of
God being postulated. The kind of God at issue for most of those involved in
this discussion is the God who revealed Himself in the Bible. The question then
is: What do the scriptures say about whether God would create structures or DNA
sequences for which we can find no use in unrelated organisms? This subject is
not addressed in the Bible, leaving us without an answer. We can postulate that
God would not do such a thing, but this position would not be based on any
evidence other than our own presuppositions, however reasonable they
seem.
Another theological argument that has been advanced
against some proposed actions of God is that God would not deceive us by acting
in certain ways. This is equivalent to claiming that our understanding of nature
can be trusted to accurately reveal God's activities. This argument is
especially dangerous because it places human reason above divine revelation. The
scriptures do state clearly that God does not deceive us (Titus 1:2), but they
also make it clear that we are naturally prone to draw wrong conclusions (Romans
11:33-36). The scriptures reveal the truth about history. When God tells us in
Scripture that He created in a certain way, we need not be deceived by what we
believe to be appearances to the contrary. Our experience should teach us that
much. The argument that we can figure out what God would or would not do has not
done well historically. At various times it has been claimed that God would
create only perfectly circular orbits for the planets, or that God would create
only perfect species that would not need to adapt to changing circumstances, or
that God would not permit man to contaminate space. None of these arguments has
survived. Claims about God's activities should be based on Scripture.
SCIENTIFIC PRESUPPOSITION IN THE ARGUMENT FROM SHARED PSEUDOGENES
A second assumption underlying the argument from shared
mistakes is that shared pseudogenes, in this case the shared eta globin
pseudogenes, have no function. Has it been demonstrated that these sequences
have no function?
It is difficult to completely rule out
any possibility of polypeptide production based simply on coding sequence.
Examples are known in which the apparent DNA message is altered by RNA editing,
reading frame-shifting or skipping parts of sequences (Benhar and
Engelbert-Kulka 1993, Dietz et al. 1993, Gesteland et al. 1992, Landweber and
Gilbert 1993). Nevertheless, the available evidence seems to suggest that the
eta globin pseudogene does not code for any protein. No RNA transcript or
protein product has been identified. Each of the three "exons" contains at least
one stop codon in each of the three "reading frames." ("Reading frames" differ
in which nucleotide of each base triplet is used as the starting point.) Seven
potential start codons are present, but none of them is in "exon" one. These
potential start codons are not sufficient for protein coding function. However,
some pseudogenes may produce small amounts of polypeptides in specific tissues
(Weinshank et al. 1991; Bristow et al. 1993; Misra-Press, Cooke and Liebhaber
1994); so it is difficult to rule out the possibility that the eta globin
sequence might produce a polypeptide.
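The idea of checking all three reading frames for stop codons can be sketched as follows. The example sequences are invented and are not the eta globin sequence; the function names are hypothetical, and only one strand is scanned.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def stop_codons_by_frame(sequence):
    """Return {frame: [start positions of stop codons]} for frames 0, 1 and 2."""
    sequence = sequence.upper()
    hits = {0: [], 1: [], 2: []}
    for frame in range(3):
        # Step through the sequence three bases at a time from this offset.
        for i in range(frame, len(sequence) - 2, 3):
            if sequence[i:i + 3] in STOP_CODONS:
                hits[frame].append(i)
    return hits

def blocked_in_all_frames(sequence):
    """True if every reading frame contains at least one stop codon."""
    return all(stop_codons_by_frame(sequence).values())

# Made-up examples: one with a stop in each frame, one with none.
print(stop_codons_by_frame("TAAATAGATGA"))
print(blocked_in_all_frames("TAAATAGATGA"))
print(blocked_in_all_frames("ATGGCCGAATTC"))
```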
DNA strands come in
complementary pairs. One might wonder whether the DNA strand complementary to
the pseudogene might have some function, but there seems to be no information
available regarding this.
The eta globin pseudogene does
not appear to function in chromosomal structure. Chromosomes are organized into
loops that are attached at their bases to a nuclear material often called the
nuclear scaffold. Scaffold-associated regions are present within the beta gene
cluster, and one of them is located near the eta globin pseudogene (Jarman and
Higgs 1989). However, it appears that the scaffold-associated region is not
within the pseudogene sequence itself, making it unlikely that the pseudogene
sequence functions in chromosomal structure.
The
observation that the eta globin pseudogene is not associated with any known
genetic defect is offered as further argument for its lack of function. Several
hemoglobin beta globin abnormalities are known, but none of them is associated
specifically with the eta globin pseudogene (Stamatoyannopoulos and Nienhuis
1994). This is interpreted as supporting the assertion that the pseudogene has
no function. However, this argument is quite weak. The same result would be
expected if mutations in the pseudogene were lethal: no affected individuals
would be observed because they would not survive long enough to be examined.
Individuals with defective pseudogene
sequences have been reported, but their abnormal hemoglobins were attributed to
deleted portions outside the pseudogene sequence. It would be helpful to know
whether normal individuals exist without the pseudogene sequence. Until more
information is available, the claim that the eta globin pseudogene has no
effect on health cannot be said to have been demonstrated.
The possibility that pseudogenes may have some function is worth exploring
further. Some pseudogenes are believed to function as sources of information for
producing genetic diversity (Fotaki and Iatrou 1993, Wedell and Luthman 1993),
possibly involving a process similar to gene conversion. It is thought that
partial pseudogene sequences are copied into functional genes, producing
variants of the functional sequence. This phenomenon has been reported many
times. Some examples include the immunoglobulins of mice (Selsing et al. 1982)
and birds (Reynaud et al. 1989), mouse histone genes (Liu, Liu and Marzluff
1987), horse globin genes (Flint, Taylor and Clegg 1988), and human beta
globin genes (Fullerton et al. 1994). The possible role of the eta globin
pseudogene in gene conversion is unknown.
Regulation of
globin genes is not fully understood, but several regulatory sites and protein
factors have been identified (Stamatoyannopoulos and Nienhuis 1994). Each of the
five functional beta globin genes has its own promoter region that participates
in gene regulation. In addition, a locus control region (LCR) is found in a
region several thousand bases upstream from the gene for epsilon globin, which
is the first to be expressed.
There is no evidence that
the eta globin pseudogene functions in gene regulation of the beta globin gene
family (Engel 1993). However, that possibility has been suggested (Goodman et
al. 1984; see also Vanin et al. 1980). The chromosomal arrangement of beta
globin genes in a sequence corresponding to the timing of their activity is
striking. It appears that chromosomal location plays an important role in beta
globin gene regulation (Dillon et al. 1991).
The fact that
the eta globin pseudogene is located between the fetal and adult genes suggests
that it could play a role in gene switching — turning off the fetal gamma genes
and turning on the adult beta gene. There is evidence that gene switching in
human beta globin genes depends in some way on the sequence lying between the
fetal and adult genes (Townes et al. 1991), although it is not known whether the
eta globin sequence itself is involved. Some pseudogenes have been implicated in
gene regulation (Singh and Brown 1991; Assinder et al. 1993; Koonin, Bork and
Sander 1994). Such a role could involve competition for regulatory proteins,
production of signal RNA molecules, or perhaps some other mechanism (e.g., see
Enver et al. 1991).
Further suggestion of possible
functionality of the eta globin pseudogene comes from a comparison of the
"non-functional" sequences in humans and chimps. Non-functional sequences in
this case include the gamma-A gene introns and the entire eta globin pseudogene.
One would expect a similar rate of mutation in all non-functional sequences. We
can test this by comparing the extent of difference between various regions of
the non-functional sequences. Human and chimp gamma-A introns differ by 23 of
999 positions (2.3%). The respective eta globin "introns" differ by 16 of 999
positions (1.6%). The "exons" in the eta globin pseudogene differ by only 6 of
444 positions (1.35%). The eta globin "exons" are thus more than one-third less
divergent than the gamma-A introns. This could be explained as due to variations in
the mutation rate, but this would tend to undermine the argument that
differences in non-functional sequences are a function of time (the molecular
clock hypothesis). It seems reasonable to suspect that mutations in the eta
globin pseudogene "exons" are constrained, perhaps because it has some function
that has yet to be discovered (cf. discussion of Drosophila Adh locus in
Sullivan et al. 1994).
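The divergence figures in this paragraph can be reproduced from the counts given. The sketch below uses only those counts; the dictionary keys are labels chosen here for convenience, and the relative gap is computed against the gamma-A intron figure.

```python
# Human-chimp differences as (differing positions, total positions),
# using the counts quoted in the text.
regions = {
    "gamma_A_introns": (23, 999),
    "eta_introns": (16, 999),
    "eta_exons": (6, 444),
}

percent_divergence = {
    name: 100 * differing / total
    for name, (differing, total) in regions.items()
}

# How much lower the eta "exon" divergence is, relative to the gamma-A introns.
relative_gap = (
    percent_divergence["gamma_A_introns"] - percent_divergence["eta_exons"]
) / percent_divergence["gamma_A_introns"]

for name, value in percent_divergence.items():
    print(name, round(value, 2))
print(round(relative_gap, 2))
```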
Another presupposition of the
argument from shared mistakes is that the shared changes could not have arisen independently,
but must have been inherited from a common ancestor. Although convergence and
parallelism are common problems in morphological studies (e.g., Carroll and
Currie 1991), it seems improbable that identical nucleotide changes would occur
independently. However, there is some evidence that nucleotide changes may not
be random. Mutational "hotspots" (e.g., Hardison et al. 1991) have been
identified, and independent gene duplication events have been inferred (Menotti,
Starmer and Sullivan 1991).
ARE PSEUDOGENES "JUNK DNA"?
It has been thought that only a small proportion of DNA
codes for proteins. Typical estimates have been that perhaps 3% of the genome is
involved. Recent discoveries (Wilson et al. 1994) indicate a figure closer to
30% for a roundworm. The figure for humans is not yet known, but it seems
reasonable to expect a similar number. What is the function of the remaining
portion? A large amount of DNA would be required for gene regulation, but this
still leaves a significant part of the DNA with unknown function. That DNA
fraction with no apparent function has been called "junk DNA." Junk DNA has been
thought to include intervening sequences (introns), satellite DNA (a highly
repetitive DNA fraction), repetitive sequences, and
pseudogenes.
As knowledge of the genome has increased,
functions have been discovered for some of the sequences thought to be "junk"
(Nowak 1994). For example, introns function in splicing together transcripts of
exons. This constrains the kinds of changes that intronic sequences can
tolerate. Some introns contain coding sequences which produce functional gene
products (see Doolittle 1993 for review). Satellite DNA appears to be involved
in chromosomal structure, especially at the ends (telomeres) and attachment
points (centromeres) of the chromosomes. Repetitive DNA seems to have effects
that are not well understood. Some diseases seem to be related to repetitive
sequences (see Maddox 1994). It was recently noted that repetitive sequences
seem to have a genomic arrangement characteristic of some kind of information
code (Flam 1994), although the test used for this is apparently weak. Some
supposed pseudogenes have been shown to be transcribed at low levels or selectively
(e.g., Yaswen et al. 1992; Imai et al. 1993; Vazeux, le Scanf and Fandeur 1993),
which might suggest some function. The list of DNA sequences that have no effect
on the organism has steadily decreased as knowledge of the operation of the
genome has increased. This is reminiscent of the history of vestigial organs, in
which apparent lack of function was actually lack of knowledge about the
function. There is still much about pseudogenes that is not understood (Sullivan
et al. 1994).
Organisms may have been originally created
with complex genomic systems which have since degenerated. Perhaps this
degeneration led to the production of useless DNA sequences. Perhaps copying
errors, unequal crossing over, and disruptive transposition have all contributed
to this process. Some pseudogenes may be junk DNA. However, the argument that
particular DNA sequences must not have a function because we haven't discovered
any function for them is an argument from silence. The discovery of genetic
activity among some sequences thought to be pseudogenes justifies caution in
interpretation. Scientists will continue to learn more about how DNA sequences
interact. They will undoubtedly discard some of their present ideas and discover
new principles of genetics. Perhaps a purpose for pseudogenes will be among
their discoveries.
ACKNOWLEDGEMENTS
The author wishes to thank several reviewers for helpful comments and suggestions.
LITERATURE CITED
All contents copyright Geoscience Research Institute. All rights reserved.