Directional adjacency-score function for protein fold recognition

  • Heo, Mu-Young (Creative Research Initiatives Center for Proteome Biophysics, Department of Physics, Pusan National University) ;
  • Cheon, Moo-Kyung (Creative Research Initiatives Center for Proteome Biophysics, Department of Physics, Pusan National University) ;
  • Kim, Suhk-Mann (Department of Chemistry, Pusan National University) ;
  • Chung, Kwang-Hoon (Creative Research Initiatives Center for Proteome Biophysics, Department of Physics, Pusan National University) ;
  • Chang, Ik-Soo (Creative Research Initiatives Center for Proteome Biophysics, Department of Physics, Pusan National University)
  • Published: 2009.06.30

Abstract

Introduction: Designing a protein score function that stabilizes the native structures of many proteins simultaneously is a challenge. The coarse-grained descriptions of proteins used to construct pairwise-contact score functions usually ignore the backbone directionality of protein structures. We propose a new two-body score function that stabilizes the native states of all 1,006 training proteins simultaneously. It differs from the usual pairwise-contact functions in that it considers the two adjacent amino acids at the two ends of each peptide bond, together with the backbone direction from the N-terminus to the C-terminus. Each score is the propensity for a directional alignment of two adjacent amino acids in their local environments.

Results and Discussion: We constructed the directional adjacency-score function using 1,006 training proteins with sequence homology below 30%, which include representatives of all protein classes. After the local environments of amino acids were parameterized into 9 categories, defined by three secondary structures and three kinds of hydrophobicity, the 32,400 adjacency scores were determined by perceptron learning and protein threading. These scores simultaneously stabilize the native folds of all 1,006 training proteins. When the parameters were tested on 382 new, distinct proteins with sequence homology below 90%, the native folds of 371 proteins (97.1%) were recognized. Using the same parameters, we also showed that the retro sequences of the SH3 domain, the B domain of Staphylococcal protein A, and the B1 domain of Streptococcal protein G could not be stabilized to fold, which agrees with the experimental evidence.
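The scoring and learning scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each residue is described by one of 20 amino-acid types and one of 9 environment labels (3 secondary-structure classes times 3 hydrophobicity classes), giving 180 states and 180 × 180 = 32,400 ordered-pair scores, which matches the count quoted above. It further assumes that a lower score is more favorable, that decoys come from gapless threading, and that a plain perceptron update is used. All names (`state_index`, `pair_counts`, `gapless_decoys`, `train_perceptron`) are hypothetical.

```python
import numpy as np

N_AA = 20    # amino-acid types
N_SS = 3     # secondary-structure classes (assumed labels 0..2)
N_HYD = 3    # hydrophobicity classes (assumed labels 0..2)
N_STATES = N_AA * N_SS * N_HYD   # 180 states, so 180 * 180 = 32,400 scores


def state_index(aa, ss, hyd):
    """Map (amino acid, secondary structure, hydrophobicity) to 0..179."""
    return (aa * N_SS + ss) * N_HYD + hyd


def pair_counts(seq, env):
    """Count ordered adjacent pairs (residue i -> residue i+1).

    seq: list of amino-acid indices; env: list of (ss, hyd) per position.
    counts[a, b] and counts[b, a] are distinct, which encodes the
    N-terminal to C-terminal direction of the backbone.
    """
    counts = np.zeros((N_STATES, N_STATES))
    states = [state_index(a, s, h) for a, (s, h) in zip(seq, env)]
    for left, right in zip(states[:-1], states[1:]):
        counts[left, right] += 1
    return counts


def score(w, seq, env):
    """Total score of a sequence mounted on a structure's environments."""
    return float(np.sum(w * pair_counts(seq, env)))


def gapless_decoys(n, env_long):
    """All length-n windows of a longer environment list (gapless threading)."""
    return [env_long[k:k + n] for k in range(len(env_long) - n + 1)]


def train_perceptron(examples, epochs=100, lr=1.0):
    """Learn w so that every native scores below each of its decoys.

    examples: list of (seq, native_env, [decoy_env, ...]).
    Plain perceptron rule: update only on violated inequalities.
    """
    w = np.zeros((N_STATES, N_STATES))
    for _ in range(epochs):
        violations = 0
        for seq, native_env, decoys in examples:
            c_nat = pair_counts(seq, native_env)
            for decoy_env in decoys:
                diff = c_nat - pair_counts(seq, decoy_env)
                if not diff.any():
                    continue  # decoy indistinguishable from native
                if np.sum(w * diff) >= 0:   # native not yet below decoy
                    w -= lr * diff
                    violations += 1
        if violations == 0:
            break  # all inequalities satisfied
    return w
```

Given a list of `(seq, native_env, decoys)` tuples, `train_perceptron` returns a 180 × 180 table, and `score(w, seq, env)` can then rank candidate structures for a sequence. Threading a reversed sequence through the same environments generally gives a different score, because the table is not symmetric.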
