Construction of Large Library of Protein Fragments Using Inter Alpha-carbon Distance and Binet-Cauchy Distance

  • Chi, Sang-mun (Department of Computer Science and Engineering, Kyungsung University)
  • Received : 2015.08.21
  • Accepted : 2015.09.22
  • Published : 2015.12.31

Abstract

Representing the three-dimensional structure of a protein as a one-dimensional sequence of protein fragments, which are local structures of the protein, is useful for the analysis, modeling, search, and prediction of protein structures. This paper investigates an effective combination of distance measures that can exploit a large protein structure database in order to construct a protein fragment library that represents native protein structures accurately. The library is constructed by clustering. The initial clustering stage uses the inter alpha-carbon distance, which has the lowest computational cost of the measures considered, and the cluster extension stage uses a combination of the inter alpha-carbon distance, the Binet-Cauchy distance, and the root mean square deviation. A protein fragment library was constructed from a large protein structure database using the proposed combination of distance measures. In experiments that represent protein structures with fragments from this library, the resulting root mean square deviation was low.
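
The two superposition-free measures named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation, and it rests on several assumptions: a fragment is a list of alpha-carbon coordinates, the inter alpha-carbon distance is the root mean square difference between the internal pairwise distance profiles of two fragments, the Binet-Cauchy score is the normalized determinant det(X^T Y) / sqrt(det(X^T X) det(Y^T Y)) of centered coordinates, and the score is converted to a distance as 1 - score. All function names are hypothetical.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Fragment = Sequence[Point]  # alpha-carbon coordinates in residue order


def ca_distance_profile(frag: Fragment) -> List[float]:
    """All pairwise distances between alpha-carbons inside one fragment."""
    return [math.dist(a, b) for a, b in combinations(frag, 2)]


def inter_ca_distance(f1: Fragment, f2: Fragment) -> float:
    """RMS difference of the internal distance profiles (assumed definition).

    No superposition is required, so the cost per pair stays low.
    Both fragments must have the same number of residues.
    """
    p1, p2 = ca_distance_profile(f1), ca_distance_profile(f2)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p1, p2)) / len(p1))


def centered(frag: Fragment) -> List[List[float]]:
    """Translate the fragment so that its centroid sits at the origin."""
    n = len(frag)
    c = [sum(p[k] for p in frag) / n for k in range(3)]
    return [[p[k] - c[k] for k in range(3)] for p in frag]


def cross_gram(x: List[List[float]], y: List[List[float]]) -> List[List[float]]:
    """3x3 matrix X^T Y for two coordinate matrices with n rows each."""
    return [[sum(px[i] * py[j] for px, py in zip(x, y)) for j in range(3)]
            for i in range(3)]


def det3(m: List[List[float]]) -> float:
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def binet_cauchy_score(f1: Fragment, f2: Fragment) -> float:
    """Normalized det(X^T Y); close to 1 for fragments of the same shape."""
    x, y = centered(f1), centered(f2)
    norm = det3(cross_gram(x, x)) * det3(cross_gram(y, y))
    if norm <= 0.0:
        # Planar or collinear fragments make the normalization vanish.
        raise ValueError("degenerate fragment: coordinates span fewer than 3 dimensions")
    return det3(cross_gram(x, y)) / math.sqrt(norm)


def binet_cauchy_distance(f1: Fragment, f2: Fragment) -> float:
    """Assumed conversion of the score to a distance: 1 - score."""
    return 1.0 - binet_cauchy_score(f1, f2)
```

Under these definitions both measures ignore rotation and translation, so a rotated and shifted copy of a fragment gives an inter alpha-carbon distance of about 0 and a Binet-Cauchy score of about 1. A mirror image also gives an inter alpha-carbon distance of 0, because reflection preserves internal distances, while its Binet-Cauchy score is about -1. Neither function computes the root mean square deviation, which additionally requires an optimal superposition step.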
