Towards a Hypothesis of Molecular Design
Though scholars in the ID movement have repeatedly argued that Darwinian evolutionary processes are insufficient to account for the biochemical complexity at the heart of life, relatively little effort has been devoted to developing a testable hypothesis of intelligent design. Yet if intelligent design is to make any significant progress in academia, and if it is to lead to fruitful research, a positive intelligent design hypothesis is sorely needed. In a previous essay, a testable hypothesis on the engineering of molecular machines was briefly described. Here, I expand on that model of biological intelligent design and discuss several predictions that follow from the hypothesis, which will be termed molecular design. In brief, the molecular design hypothesis proposes that the components of molecular machines were engineered through the strategy of rational design, similar to the method humans use to design proteins. More specifically, its central thesis is that the first biological cells were contrived by engineers with intelligence analogous to our own, and that the protein machinery of these cells was designed by methods similar to our present techniques for protein design. The proposed mechanism behind the biochemical complexity of life is thus essentially rational design and directed evolution. Bear in mind that directed evolution is a very specific method of protein design, and is not synonymous with the step-by-step evolution of molecular machines.
This essay will begin with an overview of rational design and directed evolution in protein design by our species. By exploring these two mechanisms of protein engineering (and more broadly, the engineering of molecular machines), predictions of the molecular design hypothesis can be more fully fleshed out.
Methods in Protein Design: Rational Design and Directed Evolution
Rational design involves either modifying an existing protein sequence in a determined, specified way or designing an entirely novel protein. The latter approach will be discussed later. Below is a figure (Figure 1) that depicts the general procedure behind the rational design of proteins. In rational design, knowledge of the structure and function of the protein is critical. By analyzing the structure and function of the protein, one can predict the effects of changing certain amino acids. For example, changing key amino acids might enable a receptor to bind more tightly with its ligand, increasing functionality.
Figure 1. The basic process behind the rational design of proteins.
Using site-directed mutagenesis, specific amino acid residues can be mutated in the desired way. The process of site-directed mutagenesis is probably familiar to all biology majors, so I am only including a figure (Figure 2) to illustrate how site-directed mutagenesis works.
Figure 2. Site-directed mutagenesis.
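For readers who prefer code to figures, the outcome of a site-directed mutation at the protein level can be sketched in a few lines of Python. This is only an illustration of the idea of changing one specified residue. The function name, the "T3S"-style notation parser, and the example sequence are my own assumptions, not part of any laboratory protocol, and real mutagenesis is of course carried out on the DNA.

```python
import re

def apply_substitution(protein, mutation):
    """Apply a point substitution written like 'T3S'
    (original residue, 1-based position, new residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation format: {mutation!r}")
    old, position, new = match.group(1), int(match.group(2)), match.group(3)
    index = position - 1  # convert to 0-based indexing
    if not 0 <= index < len(protein):
        raise ValueError("position lies outside the sequence")
    if protein[index] != old:
        raise ValueError(f"expected {old} at position {position}, found {protein[index]}")
    return protein[:index] + new + protein[index + 1:]

# Hypothetical ten-residue sequence, used only for illustration.
wild_type = "MKTAYIAKQR"
mutant = apply_substitution(wild_type, "T3S")
```

The check that the stated original residue really sits at the stated position mirrors the point made above: rational design depends on knowing the sequence one is modifying.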
Protein sequences can also be designed de novo. That is, a novel protein can be designed from scratch instead of by modifying an existing protein. Synthetic DNA sequences are ultimately cloned and expressed, leading to the translation of the de novo protein sequence. There are numerous examples of protein sequences that have been designed de novo (see, e.g., Kuhlman et al., 2003; Fisher et al., 2011). Importantly, protein sequences designed de novo have no homologous counterparts.
Under the heading of rational design comes the method of blob-level protein design (see Figure 3). As stated in the figure, the main idea behind blob-level protein design is to combine protein units of defined function (domains) to engineer a fusion protein with novel functionality.
Figure 3. "Blob-level" protein design.
Unlike rational design, directed evolution relies more heavily on selection. Very simply, a gene sequence encoding a given protein is randomly mutated (through error-prone PCR, for example), which results in a library containing many variants of the sequence. The variants that possess the desired function are then selected. Next, the selected sequences are amplified through PCR, and the process is repeated as many times as necessary.
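The mutate, select, amplify, and repeat cycle just described can be sketched as a toy simulation. This is a minimal sketch under loud assumptions: the "screen" here simply counts matches to a made-up target sequence, which stands in for a functional assay, and all names and parameter values (`library_size`, `keep`, `rate`, and so on) are hypothetical choices of mine.

```python
import random

ALPHABET = "ACGT"

def mutate(seq, rate, rng):
    """Replace each base with a different one at probability `rate`
    (a stand-in for error-prone PCR)."""
    return "".join(
        rng.choice([b for b in ALPHABET if b != base]) if rng.random() < rate else base
        for base in seq
    )

def fitness(seq, target):
    """Toy screen: number of positions matching a hypothetical target."""
    return sum(a == b for a, b in zip(seq, target))

def directed_evolution(parent, target, rounds=20, library_size=200,
                       keep=10, rate=0.05, seed=0):
    rng = random.Random(seed)
    pool = [parent]
    for _ in range(rounds):
        # Diversify: build a library of random variants from the current pool.
        library = [mutate(rng.choice(pool), rate, rng) for _ in range(library_size)]
        # Select: keep the best scorers; the previous pool stays in the
        # comparison, so the best score can never decrease.
        pool = sorted(library + pool, key=lambda s: fitness(s, target), reverse=True)[:keep]
    return pool[0]

parent = "A" * 30
target = "ACGT" * 7 + "AC"  # arbitrary 30-base target for the toy screen
best = directed_evolution(parent, target)
```

Note that nothing in the loop knows which positions to change. The improvement comes entirely from repeated rounds of random variation followed by selection, which is what distinguishes this method from rational design.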
The principal methods behind protein engineering, described above, are employed for the design of single protein molecules. But what about molecular machines, which are composed of several (often dozens of) protein parts? How can such multi-component machines be engineered? Designing a protein machine would begin, of course, with planning the arrangement of protein parts (and the kinds of proteins that would be needed) such that function is produced. After this has been accomplished, the following steps would be carried out:
a. The inner, core proteins of the machine would be engineered first (through rational design and directed evolution). De novo design of the first core protein would be followed by the design of a protein that could tightly bind to it. Alternatively, a protein from another machine would be borrowed so that de novo design would not be necessary.
b. More and more proteins would be designed and added to the initial core proteins.
c. Once the genes encoding the necessary proteins have been designed, genes regulating the assembly of the machine would be engineered.
After the machine is designed, it can be modified and changed to produce a machine with a different function. These methods, then, are the major techniques behind protein engineering. Several salient points emerge that are worth mentioning:
In rational design of molecular machines, protein parts are often borrowed from other systems (and modified as necessary to produce optimal functionality). This would result in statistically significant levels of sequence similarity between the components of the machine and other proteins that are not part of the machine.
Proteins that are designed de novo share no statistically significant sequence similarity with other proteins (in general).
When it comes to designing molecular machines, there is no step-by-step, cumulative evolution wherein every step offers a selective advantage. Instead, the components of the machine are integrated at approximately the same time. Neither millions of years nor even decades are needed to engineer a molecular machine with dozens of components, because the engineering approach has foresight.
Individual parts of proteins (domains) can be swapped around to design specific functions.
Tests of the Molecular Design Hypothesis
What predictions naturally follow from the molecular design hypothesis? Let us suppose that we find a biological machine X, and that some of its parts are similar to those of a machine Y. Under current theory, this similarity is attributed to shared ancestry. Yet it is this very similarity that can act as a springboard for testing the molecular design hypothesis. The bacterial flagellum is probably the most familiar icon of the intelligent design debate, so it will be used as an example in the predictions discussed below.
Prediction 1: Molecular clock analyses of protein sequences of the components of the machine should reveal a specific pattern. Under the Darwinian model, the evolution of a machine like the flagellum proceeds through the stepwise co-option of parts. For example, the flagellar ATPase would be borrowed from cellular F-ATPases and integrated with a primitive membrane pore. Next, after the evolution of gated proteins, etc., proton channels (proto-ExbB/proto-ExbD) would be co-opted to form the flagellar motor proteins MotA and MotB, respectively. A proto-MgtE copy would then associate with the flagellar motor, resulting in a FliG protein. Thus, under the Darwinian model of the origin of the flagellum, molecular clock analyses of the flagellum-specific ATPase and the F-ATPase, MotAB and ExbBD, and FliG and MgtE protein sequences should show divergence times in the following chronology: the split between the flagellum-specific ATPase and the F-ATPase --> the MotAB/ExbBD divergence --> and the FliG/MgtE split, where the arrows denote the passing of time.
In contrast, the molecular design hypothesis predicts that the divergence times of the machine parts and their homologs should follow a different pattern. One might expect that the divergence times should be approximately equal, indicating that the parts of the machine originated simultaneously, but this would ignore the fact that modification of protein parts would often be necessary. As a simple example, it is clear that simply plugging the ExbB/ExbD proteins into the flagellum for motor purposes would not be optimal. ExbB and ExbD interact with a membrane protein known as TonB, so ExbB and ExbD have segments unrelated to flagellar function. If these segments were not altered in the right ways, then they could interfere with the function of the flagellum. This engineered modification of homologous components must be taken into account. As I explained earlier:
this means that we cannot logically predict — from a design perspective — that molecular clocks will demonstrate that all machine components originated at about the same time. This is because if some proteins are modified more substantially than others, it would confuse the molecular clock. Protein components that have undergone more drastic modifications will have the appearance of being more ancient (using a molecular clock), while proteins that are only slightly changed will appear to have originated more recently. In particular, the design hypothesis predicts that molecular clocks will show that proteins with rapid substitution rates will have a later origin, while proteins with slow substitution rates will have an early origin. We can summarize this prediction in this manner: in general, the slower the substitution rate, the more ancient the protein will appear to be. If a protein has a slow substitution rate, then any modifications to the sequence of that protein will give the appearance of a large amount of time passing by. In contrast, even fairly extensive modifications to a protein with a rapid substitution rate will not significantly affect the molecular clock. To further refine this prediction, we can take into account the amount of modification that would be needed for a given protein. (Included here is the figure that was provided in the previous essay)
Figure 4. A summary of the prediction of the molecular design hypothesis.
How can we make an estimate of the amount of engineered modification needed for any given component? This can be accomplished by taking a holistic approach to the protein parts and systems in question. That is, by analyzing the functions, structures, and interactions of the proteins and systems, one can infer approximately how much modification would be needed to incorporate a borrowed part into an engineered machine. A close inspection of the flagellar rod proteins, for example (which are homologous to each other), reveals that only a minimal amount of engineering would be needed to modify one rod protein into another. Their functions and interactions are similar, as is their cellular localization. Relative to the flagellar rod proteins, a great deal of modification would be needed to turn an MgtE copy into FliG. MgtE is a magnesium transporter and is not part of a multi-component protein system, while FliG interacts with the flagellar MS-ring and motor, as well as with FliM/N.
Figure 4 summarizes the prediction of the molecular design hypothesis discussed above.
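The reasoning behind Prediction 1 can be made concrete with a small calculation. The sketch below assumes a strict molecular clock, no correction for multiple substitutions at the same site, and engineered changes applied to only one of the two homologs. The function name and all numerical values are hypothetical and chosen purely for illustration. Under those assumptions, the observed distance is 2 × rate × time plus the engineered changes, and dividing by 2 × rate gives the age a clock analysis would report.

```python
def apparent_divergence_time(true_time, rate, engineered_changes):
    """Age a strict molecular clock would report for an engineered part.

    true_time: years since the part was (hypothetically) engineered
    rate: substitutions per site per year along one lineage
    engineered_changes: engineered substitutions per site (one lineage only)
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    observed_distance = 2 * rate * true_time + engineered_changes
    return observed_distance / (2 * rate)

# Same true age and same amount of engineering, different substitution rates.
slow = apparent_divergence_time(true_time=1e9, rate=1e-10, engineered_changes=0.1)
fast = apparent_divergence_time(true_time=1e9, rate=1e-9, engineered_changes=0.1)
```

With identical engineering, the slowly evolving protein is reported as considerably older than the rapidly evolving one, because the inflation term, engineered changes divided by 2 × rate, grows as the rate shrinks.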
Prediction 2: Molecular clock analyses of synonymous sites in the genes encoding the proteins under consideration should demonstrate approximately equal divergence times. While engineers might find it necessary to modify a protein part borrowed from another system, this modification would take place at the amino acid level. However, molecular clock analyses are not restricted to amino acid sequences. Clock analyses can also be conducted on the synonymous sites of two different gene sequences. Since the synonymous sites would not be affected by any engineered modifications (given that the engineered changes would be made at the amino acid level), the divergence times of machine parts as estimated from synonymous sites should be approximately equal. Furthermore, clock analyses carried out on the basis of synonymous sites should disagree with clock analyses performed on non-synonymous sites.
An example may be cited here. Suppose the bacterial flagellum was engineered. As such, FliG was adapted from MgtE and integrated into the flagellar system, and the flagellar ATPase was borrowed (and modified as necessary) from the F-ATPase. Through molecular clock analyses, the divergence times of FliG/MgtE and the flagellum-specific ATPase/F-ATPase could be calculated. The divergence times would be expected to match the prediction described in Prediction 1. However, keep in mind that, in reality, these parts are being engineered into the flagellum at approximately the same time. So if we could find a molecular clock method that is independent of functional requirements, it would be possible to determine if these parts truly did originate at the same time. Fortunately, a molecular clock method based on synonymous sites provides such a method. Since the synonymous sites are not modified (unless their source organisms are significantly different) by any engineering methods, clock analyses of synonymous sites should give the actual divergence times of these proteins, and those divergence times should be nearly equal.
In the example above, then, all synonymous sites of the MgtE and FliG genes would be taken into consideration, and the synonymous substitution rate calculated. Next, the number of synonymous substitutions between MgtE and FliG would be determined, and thereby the divergence times of the two proteins could be established. This same procedure would be employed for the flagellar ATPase and the F-ATPase. The molecular design hypothesis predicts that the calculated divergence times of these two pairs of proteins — based on synonymous sites — should be approximately the same. Treatments of the methods and techniques behind molecular clock analyses will be found in molecular evolution and bioinformatics textbooks; these should be consulted if the reader is interested in a further understanding of the subject.
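The procedure above can be sketched in code. This is a simplified illustration rather than a full method: it assumes the proportion of synonymous differences per synonymous site has already been estimated for each gene pair, applies the Jukes-Cantor correction for multiple hits, and assumes a strict clock with a known synonymous rate. The function names, the 20% tolerance, and every number are hypothetical placeholders, not measurements.

```python
import math

def jukes_cantor(p):
    """Correct an observed proportion of differing sites for multiple hits."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1 - 4 * p / 3)

def divergence_time(p_syn, rate_syn):
    """Divergence time from synonymous differences, assuming a strict clock.

    p_syn: proportion of synonymous differences per synonymous site
    rate_syn: synonymous substitutions per site per year along one lineage
    """
    return jukes_cantor(p_syn) / (2 * rate_syn)

def approximately_equal(t1, t2, tolerance=0.2):
    """True if the two estimates differ by at most `tolerance` (relative)."""
    return abs(t1 - t2) <= tolerance * max(t1, t2)

# Placeholder inputs for the two gene pairs discussed in the text.
t_flig_mgte = divergence_time(p_syn=0.30, rate_syn=1e-9)
t_atpases = divergence_time(p_syn=0.32, rate_syn=1e-9)
prediction_met = approximately_equal(t_flig_mgte, t_atpases)
```

In a real analysis the tolerance would be replaced by confidence intervals on each estimate, and very old divergences would run into saturation of synonymous sites, where the correction breaks down.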
I will now describe two predictions of the molecular design hypothesis that arise from special cases.
Prediction 3: Fusion proteins. Fusion proteins are proteins that are composed of two or more fused proteins — proteins that originally functioned independently. Recombinant technology is widely used to create fusion proteins, and fusion proteins can also arise through random mutations. If a fusion protein is a component in a biological machine, this provides an opportunity to test the molecular design hypothesis. In Figure 5, protein C is a protein component in a molecular machine hypothesized to have been engineered. Protein C is a fusion protein: the red portion is similar to protein A, and the green portion is similar to protein B. To engineer protein C, these two proteins (A and B) must be fused together. But this is not always the whole picture. For the fusion protein C to function in the context of the engineered machine, modification to proteins A and B would be done. In other words, the protein sequence of protein A would have to be tweaked in just the right way so that it fits nicely with the rest of the machine, and the same applies to protein B. There is one more element to this, however. It is not likely that both proteins A and B would have to be tweaked to the same degree, since these are, after all, proteins with different functions. So the sequence of protein A, for example, is modified more substantially than protein B. The two proteins are then fused using recombinant DNA technology, and the resulting protein is integrated into the engineered machine. The molecular clock starts ticking for each of the two domains in the fusion protein (the red and green portions). This is where the prediction stems from. Based on the molecular design hypothesis, I would predict that molecular clock analyses of the different parts of the fusion protein (that is, molecular clock analyses of each domain in the fusion protein with its homologous counterpart) should yield different divergence times. 
The logic of this prediction is simple: (a) the parts A and B were modified to different degrees, skewing the molecular clock, and making one appear more ancient, (b) the molecular clock then starts ticking (once they have been fused and the machine has been deployed in the wild), with the fusion protein accumulating substitutions. Since the original modification to the proteins A and B skewed the molecular clock, the molecular design hypothesis predicts that in general, molecular clock analyses of the parts of a fusion protein should show one of the parts to be more ancient than the other.
Figure 5. A fusion protein engineered from two different proteins. The asterisks represent the amount of modification that has been done to each protein part, as well as the subsequent accumulation of substitutions.
Here it should be emphasized that the Darwinian theory of the origin of molecular machines leads to a different prediction. Within the evolutionary framework, proteins A and B simply fuse, and at that moment the molecular clock starts ticking for the novel fusion protein. The prediction would be that the different parts of the fusion protein diverged at the same time from their respective homologs.
Prediction 4: Protein domains. This prediction concerns duplicate protein domains that carry out the same function in the same protein. Suppose we have a protein component, A, which consists of the domains B, B, and C. The two B domains serve the same function in protein A, and are homologous to a domain in another protein (this ancestral domain will be termed B1). Since the two B domains function in the same way, an equal amount of modification would be done to them (if any at all). From an engineering perspective, the two domains would be placed in protein A at the same time. They would subsequently diverge. When compared to B1, then, they should be genetically equidistant. Therefore, molecular design predicts that domains with the same function in a protein should be equidistant from their ancestral domain.
Again, this is not predicted if a molecular machine is the product of evolutionary processes. Duplicate domains need not arise simultaneously. Instead, a domain can be incorporated in a protein, and only later a second domain of the same type might be integrated with the protein. It is true, of course, that duplicate domains can evolve at the same time. But this is not a prediction of current theory. Current theory would be able to explain the above observation, but it would not predict it.
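A test of Prediction 4 could begin with something as simple as the sketch below, which compares the distance of each duplicate domain from the reference domain B1. It assumes the three sequences are already aligned to the same length, uses the uncorrected proportion of differing sites as the distance, and treats a 0.05 difference as the cutoff for "equidistant". The sequences, function names, and cutoff are all hypothetical.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences,
    skipping columns where either sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)

def equidistant(domain_1, domain_2, reference, tolerance=0.05):
    """True if both domains lie at nearly the same distance from the reference."""
    difference = abs(p_distance(domain_1, reference) - p_distance(domain_2, reference))
    return difference <= tolerance

# Made-up aligned sequences: B1 is the reference domain, the others are copies.
b1 = "MKTAYIAKQR"
copy_a = "MKTAYLAKQR"   # one difference from B1
copy_b = "MKSAYIAKQR"   # one difference from B1
copy_c = "MRSGYLAKQR"   # four differences from B1
```

Here `equidistant(copy_a, copy_b, b1)` is the outcome molecular design predicts for duplicate domains, while a result like `equidistant(copy_a, copy_c, b1)` being false would count against it. A serious analysis would use corrected distances and a statistical test in place of a fixed cutoff.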
I have endeavored to describe the molecular design hypothesis in greater depth than in the previous essay. Several predictions of the hypothesis were delineated; there are undoubtedly more that were not discussed here. The next step is to actually test the hypothesis, which can be accomplished using standard bioinformatics techniques. By determining whether these predictions are met, it will be possible to detect artificiality, or the lack thereof, in molecular machines. Having said this, I wish to strongly encourage intelligent design proponents to do more than simply critique the current view on biological origins. A mechanistic model of intelligent design is what is needed, and this is what I have attempted to outline.
Kuhlman, B., Dantas, G., Ireton, G.C., Varani, G., Stoddard, B.L., Baker, D., 2003. Design of a novel globular protein fold with atomic-level accuracy. Science 302(5649), 1364-8.
Fisher, M.A., McKinley, K.L., Bradley, L.H., Viola, S.R., Hecht, M.H., 2011. De novo designed proteins from a library of artificial sequences function in Escherichia coli and enable cell growth. PLoS One 6(1), e15364. doi: 10.1371/journal.pone.0015364.
From: EvC Forum
Thread copied to the Towards a Hypothesis of Molecular Design thread in the Intelligent Design forum, this copy of the thread has been closed.