AllEC: An Implementation of Application for EC Numbers Prediction based on AEC Algorithm

  • Park, Juyeon (Dept. of Computer and Electronic Engineering, Sunmoon University) ;
  • Park, Mingyu (Dept. of Computer and Electronic Engineering, Sunmoon University) ;
  • Han, Sora (Dept. of Life Science and Biochemical Engineering, Graduate School, Sunmoon University) ;
  • Kim, Jeongdong (Div. of Computer Science and Engineering, Sunmoon University) ;
  • Oh, Taejin (Dept. of Life Science and Biochemical Engineering, Graduate School, Sunmoon University) ;
  • Lee, Hyun (Div. of Computer Science and Engineering, Sunmoon University)
  • Received : 2022.04.18
  • Accepted : 2022.06.02
  • Published : 2022.06.30

Abstract

As sequencing technology advances, there is a growing need for methods that predict the function of protein sequences. Enzyme Commission (EC) numbers are increasingly used as markers that distinguish the function of a sequence, and many researchers are studying deep-learning-based methods for predicting the EC numbers of protein sequences. However, because many different prediction methods exist, it is unclear which prediction result for a given sequence is correct. To solve this problem, this paper proposes the All Enzyme Commission (AEC) algorithm. AEC runs multiple prediction methods on a sequence and integrates their results. When several methods return the same value, the algorithm assigns that value a larger weight. The weighted prediction values of the individual methods are then compared, and the largest one is taken as the final prediction result. Moreover, for the convenience of researchers, the proposed algorithm is provided through the AllEC web service, so that it can be used regardless of operating system, installation, or operating environment.
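The integration step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact weighting formula, so everything specific here is an assumption made for illustration: each method is taken to return one EC number with a confidence score, the weight of a prediction is taken to be the number of methods that returned the same EC number, and the names `aec_consensus`, `method_a`, `method_b`, and `method_c` are hypothetical.

```python
from collections import Counter


def aec_consensus(predictions):
    """Pick a final EC number from several predictors' outputs.

    predictions: dict mapping a method name to (ec_number, score).
    Assumption (not from the paper): the weight of a prediction is the
    number of methods that returned the same EC number.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")

    # Count how many methods agree on each EC number (the "duplicates").
    agreement = Counter(ec for ec, _ in predictions.values())

    # Apply the agreement weight to each method's own score.
    weighted = {
        method: (ec, score * agreement[ec])
        for method, (ec, score) in predictions.items()
    }

    # The largest weighted value determines the final prediction.
    best_method = max(weighted, key=lambda m: weighted[m][1])
    return weighted[best_method][0], weighted


# Hypothetical outputs from three predictors for one sequence.
example = {
    "method_a": ("1.1.1.1", 0.6),
    "method_b": ("1.1.1.1", 0.5),
    "method_c": ("2.7.1.1", 0.9),
}
final_ec, weighted_scores = aec_consensus(example)
```

In this example two methods agree on `1.1.1.1`, so their scores are doubled and outrank the single, higher-scoring prediction of `2.7.1.1`. A different weighting rule (for instance, additive bonuses or per-method reliability factors) would slot into the same structure.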

Acknowledgement

This research was supported by the BK21 FOUR (Fostering Outstanding Universities for Research) program funded by the Ministry of Education (MOE, Korea) and the National Research Foundation of Korea (NRF).
