Protein Structure and Protein Folding

Recombinant DNA techniques have provided tools for the rapid determination of DNA sequences and, by inference, the amino acid sequences of proteins from structural genes. The proteins we observe in nature have evolved, through selective pressure, to perform specific functions. The functional properties of proteins depend upon their three-dimensional structures[4].

      To understand the biological function of proteins we would therefore like to be able to deduce or predict the three-dimensional structure from the amino acid sequence. However, in spite of considerable efforts over the past 30 years, this folding problem is still unsolved and remains one of the most basic intellectual challenges in molecular biology[4].


      According to the experiments first performed by Christian Anfinsen in the 1950s, protein unfolding and refolding are thermodynamically reversible processes. When the environmental conditions are changed, the protein unfolds and loses its activity, but once the environment is returned to physiological conditions the protein folds spontaneously into its native structure and regains its activity. Thus, protein folding is apparently thermodynamically determined[7]. Anfinsen referred to this as the “thermodynamic hypothesis”. This hypothesis states that the three-dimensional structure of a native protein in its normal physiological milieu is the one in which the Gibbs free energy of the whole system is lowest; that is, that the native conformation is determined by the totality of interatomic interactions and hence by the amino acid sequence, in a given environment[2].
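      To make the thermodynamic picture concrete, the following minimal sketch computes the equilibrium fraction of folded molecules under a simplifying two-state assumption (folded versus unfolded only). This two-state reduction, the function name fraction_folded and the free energy values are our illustrative assumptions, not part of Anfinsen's statement, which concerns the free energy of the whole system.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)


def fraction_folded(delta_g, temperature=298.15):
    """Equilibrium fraction of folded molecules in a two-state model.

    delta_g is G_folded - G_unfolded in kJ/mol (illustrative values only);
    a negative value means the folded state is the more stable one.
    """
    # Boltzmann weighting: [unfolded]/[folded] = exp(delta_g / (R*T))
    return 1.0 / (1.0 + math.exp(delta_g / (R * temperature)))


for dg in (-40.0, -20.0, 0.0, 20.0):
    print(f"dG = {dg:+6.1f} kJ/mol -> fraction folded = {fraction_folded(dg):.6f}")
```

      In this toy model, even a modest free energy difference in favour of the folded state makes it overwhelmingly populated at equilibrium, which is the sense in which the sequence and the environment together would fix the native structure.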

      From the thermodynamic hypothesis, the amino acid sequence appears to contain all the information necessary to determine the native 3D structure. Based on this principle, most recent methods for predicting the native 3D structure of a protein from its amino acid sequence are semi-empirical, and statistical methods are widely used. Such predictions generally involve two steps:

      1. Predict the secondary structures (or structure of some segments) by threading or alignment;

      2. Assemble the segments so as to minimize the energy, which yields the 3D structure.

However, because the principle by which a protein folds into its “unique” native state is not known, such methods succeed only for specific proteins.

      In addition to the semi-empirical methods, some researchers try to predict the native structure by searching for the structure with the lowest free energy. However, this cannot be done yet. Because of computational limitations, it is not possible to generate an ensemble of structures that adequately represents the equilibrium ensemble for even the smallest globular proteins[9]. Moreover, there is still debate over whether the native folded state of the protein represents a thermodynamically dominant state or a kinetically trapped state[8].
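      The scale of the search problem can be illustrated with a deliberately crude toy model, which is our assumption and not a method from the cited works: a chain whose bonds lie on a 2D square lattice and may not cross itself. The sketch below counts all such conformations by exhaustive enumeration; the function name count_conformations is hypothetical, and rotations and reflections are counted as distinct.

```python
def count_conformations(n_bonds):
    """Count self-avoiding walks with n_bonds steps on a 2D square lattice."""
    moves = ((1, 0), (-1, 0), (0, 1), (0, -1))

    def extend(position, visited, remaining):
        # A complete chain contributes one conformation.
        if remaining == 0:
            return 1
        total = 0
        x, y = position
        for dx, dy in moves:
            nxt = (x + dx, y + dy)
            if nxt not in visited:  # the chain may not revisit a site
                visited.add(nxt)
                total += extend(nxt, visited, remaining - 1)
                visited.remove(nxt)  # backtrack
        return total

    return extend((0, 0), {(0, 0)}, n_bonds)


for n in range(1, 11):
    print(n, count_conformations(n))
```

      Even in this two-dimensional caricature the count grows roughly geometrically with chain length, so direct enumeration is hopeless for chains of realistic size, and a real protein has far more degrees of freedom per residue.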

      The authors of [8] point out that one central question is whether the native conformation is under thermodynamic or kinetic control, i.e., whether it corresponds to the most stable conformation (thermodynamic control) or to the kinetically most accessible one (kinetic control). Neither situation has been proven. The original experiments of Anfinsen seem to support the thermodynamic hypothesis. Recent experiments, however, have suggested that there may be exceptions, especially for larger and more complex proteins[3]. It seems likely that kinetic control is more prevalent in complex cellular processes.

      The issue at the heart of this debate is to clarify the conditions under which a protein folds under thermodynamic control.

      For those proteins that follow thermodynamic control, prediction is possible in principle, by attempting to find the most stable state. However, for those proteins under kinetic control, it is impossible to predict the “native conformation” unless the mechanisms of folding are known.
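      The difference between the two kinds of control can be sketched on a made-up one-dimensional energy landscape. Everything in the example below is an illustrative assumption: the energies list, the function name downhill_fold, and the zero-temperature rule that the chain only ever moves to a lower-energy neighbouring state. Thermodynamic control corresponds to the global minimum; the kinetic outcome depends on where the process starts.

```python
# Hypothetical energies of ten neighbouring conformational states (arbitrary units).
energies = [5.0, 3.0, 2.0, 3.5, 4.0, 2.5, 1.0, 0.0, 1.5, 4.5]


def downhill_fold(energies, start):
    """Move to the lowest neighbouring state until no neighbour is lower."""
    i = start
    while True:
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < len(energies)]
        best = min(neighbours, key=lambda j: energies[j])
        if energies[best] >= energies[i]:
            return i  # local minimum: no downhill move is left
        i = best


# Thermodynamic control: the most stable state overall.
most_stable = min(range(len(energies)), key=lambda j: energies[j])

print("most stable state:", most_stable)
print("reached from state 0:", downhill_fold(energies, 0))
print("reached from state 4:", downhill_fold(energies, 4))
```

      Starting from state 0 the toy chain is trapped in a local minimum and never reaches the most stable state, whereas starting from state 4 it does. Knowing only the energies is therefore not enough to predict the outcome in the first case; the pathway matters.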

      Instead of attempting to resolve the problem of protein folding solely by devising more powerful mathematical and computational methods, we propose to adopt a different and more basic approach. We wish first to examine the problem by understanding the mechanisms of protein folding. This breaks up the complex problem of prediction into several simpler problems, each with its own characteristics.

      The mechanisms of protein folding at different stages are generally different in nature, and may require different kinds of mathematical methods for their prediction.

      By studying the mechanisms of protein folding, we will be able to discover the general principles that govern the folding process of a protein as a long-chain molecule; the main features that distinguish the native structure from the denatured structure; and guidelines for predicting the native structure of a protein from its amino acid sequence.

      Protein folding is a new area for applied mathematics; long-term study is required to understand the mechanisms of the folding process of a protein. The main questions are outlined as follows:

      1. The term “native structure” is used in biology to refer to the 3D structure of a protein under physiological conditions. However, an exact definition of the “native structure” is not known. Mathematically, we have to clarify that the “native structure” is not a single unique point in phase space, but an ensemble of conformations that share the key features enabling the biological function. It therefore remains a difficult question how to describe these “key features” in terms of the macroscopic properties of the structure of a protein. This question seeks to give a definition of the “native structure”. Obviously, to predict the “native structure” of a protein, one needs to define what the “native structure” is.

      2. Protein folding is a specific kinetic process of polymer molecules. Thus, the general principles and methods of polymer dynamics[5] are applicable to investigating the dynamical properties of a protein as a long-chain molecule. Nevertheless, despite the existence of simple models for polymer dynamics (the Rouse model and the Zimm model), it is still a difficult challenge to derive a formulation for the dynamics of a polymer that is far from the equilibrium state.

      3. In addition to the characteristics of a long-chain molecule, a protein possesses the special feature that it contains local regular structures (the secondary structure elements). The presence of regular structure elements has a significant effect on the dynamical process of protein folding. Clarifying the interaction between the regular and irregular structures is essential to understanding the mechanism of protein folding.

      4. Protein folding is a process consisting of several distinct steps:

          Unfolded state → Molten globule state → Folded state.

      The processes at the different stages are dominated by different interactions and should be modelled by different mathematical models.

      5. Protein folding is a complex process involving many interactions between the molecules. However, the basic mechanisms may be simple from a statistical point of view and may not depend on the details of the interactions. Thus, in studying the mechanisms, we should focus on the basic principles that do not depend on the detailed interactions.
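      As a minimal illustration of treating folding as a sequence of stages, the sketch below follows the populations of three states in an irreversible first-order scheme, unfolded → intermediate → folded. The irreversibility, the rate constants k1 and k2, and the function name populations are all hypothetical choices made for illustration; they are not measured values and the scheme is not proposed as the actual mechanism.

```python
import math


def populations(t, k1=10.0, k2=1.0):
    """Populations (unfolded, intermediate, folded) at time t.

    Irreversible first-order scheme U -> M -> F, starting from all U.
    k1 and k2 are hypothetical rate constants (per unit time), k1 != k2.
    """
    u = math.exp(-k1 * t)
    # Standard closed-form solution for the intermediate of two consecutive steps.
    m = k1 / (k2 - k1) * (math.exp(-k1 * t) - math.exp(-k2 * t))
    f = 1.0 - u - m  # total population is conserved
    return u, m, f


for t in (0.0, 0.1, 0.5, 1.0, 5.0):
    u, m, f = populations(t)
    print(f"t = {t:4.1f}  U = {u:.3f}  M = {m:.3f}  F = {f:.3f}")
```

      With a fast first step and a slow second step, the intermediate accumulates transiently before the folded state dominates. Each step could in principle be replaced by a more detailed model suited to the interactions that dominate at that stage.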

      Graduates with a background in biology or chemistry are welcome to join our research group and become part of the team!



[1] Akiyama, S., Takahashi, S., et al., PNAS, 99(2002), 1329-1334; Uzawa, T., Akiyama, S., et al., PNAS, 101(2004), 1171-1176; Kimura, T., Uzawa, T., et al., PNAS, 102(2005), 2748-2753.

[2] Anfinsen, C. B., Principles that govern the folding of protein chains, Science, 181(1973), 223-230.

[3] Baker, D., Agard, D. A., Kinetics versus thermodynamics in protein folding, Biochemistry, 33(1994), 7505-7509.

[4] Branden, C., Tooze, J., Introduction to Protein Structure, 2nd ed., Garland Publishing, 1998.

[5] Doi, M., Edwards, S. F., The Theory of Polymer Dynamics, Clarendon Press, Oxford, 1986, 24-32.

[6] Garrett, R. H., Grisham, C. M., Biochemistry, 2nd ed., Higher Education Press, Beijing, 2002.

[7] Kanehisa, M., Post-Genome Informatics, Oxford University Press, 2001.

[8] Lazaridis, T., Karplus, M., Thermodynamics of protein folding: a microscopic view, Biophys. Chem., 100(2003), 367-395.

[9] Straub, J. E., Protein folding and optimization algorithms, in Encyclopedia of Computational Chemistry, von Ragué Schleyer, P., ed., John Wiley & Sons, Vol. 3, 2184-2191.


Research Team: 

Chia-Chiao Lin

AAAS Fellow

Jin-Zhi Lei

Associate Researcher

Wei-Tao Sun

Associate Researcher

Liu Hong