Development of Classification Model for hERG Ion Channel Inhibitors Using SVM Method

  • Gang, Sin-Moon (Drug Discovery Division, Bioinformatics and Molecular Design Research Center) ;
  • Kim, Han-Jo (Drug Discovery Division, Bioinformatics and Molecular Design Research Center) ;
  • Oh, Won-Seok (Drug Discovery Division, Bioinformatics and Molecular Design Research Center) ;
  • Kim, Sun-Young (Drug Discovery Division, Bioinformatics and Molecular Design Research Center) ;
  • No, Kyoung-Tai (Department of Biotechnology, Yonsei University) ;
  • Nam, Ky-Youb (Drug Discovery Division, Bioinformatics and Molecular Design Research Center)
  • Published : 2009.12.20


Developing effective tools for predicting the absorption, distribution, metabolism, excretion, and toxicity (ADME/T) properties of new chemical entities at an early stage of drug design is one of the most important tasks in drug discovery and development today. As one such approach, support vector machines (SVMs) have recently been applied to the prediction of ADME/T-related properties. However, two problems in SVM modeling, feature selection and parameter setting, are still far from solved. Both have been shown to be crucial to the efficiency and accuracy of SVM classification. In particular, feature selection and the optimal SVM parameter settings influence each other, which indicates that they should be handled simultaneously. In this account, we present an integrated practical solution in which a genetic algorithm (GA) is used for feature selection and a grid search (GS) method for parameter optimization. Classification models for hERG ion-channel inhibitors, an ADME/T-related property, were built to assess and test the proposed GA-GS-SVM. We generated six models, three single models and three ensemble models, using a training set of 1891 compounds, and validated them with an external test set of 175 compounds. We compared the single models with the ensemble models, which were introduced to address the data imbalance problem. The ensemble models improved prediction accuracy.
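The coupling between feature selection and parameter setting described above can be illustrated with a minimal sketch. It is not the authors' implementation: the GA operators (truncation selection, one-point crossover, bit-flip mutation, elitism), the grid values, and the function names `grid_search`, `ga_gs`, and `toy_score` are all assumptions made for illustration. In particular, `toy_score` is a hypothetical stand-in for the cross-validated accuracy of an SVM trained on the selected descriptors with a given `C` and `gamma`; in practice it would be replaced by a call to an SVM library such as LIBSVM.

```python
import itertools
import math
import random

def grid_search(mask, score, c_grid, gamma_grid):
    """Exhaustively search the (C, gamma) grid for one feature mask."""
    return max((score(mask, c, g), c, g)
               for c, g in itertools.product(c_grid, gamma_grid))

def ga_gs(n_features, score, c_grid, gamma_grid,
          pop_size=20, generations=30, mutation_rate=0.05, seed=0):
    """GA over binary feature masks; each mask's fitness is its best grid score."""
    rng = random.Random(seed)
    pop = [tuple(rng.randint(0, 1) for _ in range(n_features))
           for _ in range(pop_size)]
    best = None
    for _ in range(generations):
        # Evaluate every mask with its own optimal (C, gamma): this is where
        # feature selection and parameter setting are handled together.
        scored = sorted((grid_search(m, score, c_grid, gamma_grid) + (m,)
                         for m in pop), reverse=True)
        if best is None or scored[0][0] > best[0]:
            best = scored[0]
        parents = [entry[3] for entry in scored[:pop_size // 2]]
        children = [scored[0][3]]  # elitism: carry the best mask forward
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)      # one-point crossover
            child = a[:cut] + b[cut:]
            child = tuple(bit ^ (rng.random() < mutation_rate)  # bit-flip mutation
                          for bit in child)
            children.append(child)
        pop = children
    best_score, c, gamma, mask = best
    return mask, c, gamma, best_score

# Hypothetical scoring function (assumption): rewards three "informative"
# features, penalizes the rest, and prefers C = 10 and gamma = 0.01.
INFORMATIVE = {0, 2, 5}

def toy_score(mask, c, gamma):
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in INFORMATIVE)
    noise = sum(1 for i, bit in enumerate(mask) if bit and i not in INFORMATIVE)
    penalty = 0.2 * abs(math.log10(c) - 1) + 0.2 * abs(math.log10(gamma) + 2)
    return hits - 0.1 * noise - penalty

C_GRID = [0.1, 1.0, 10.0, 100.0]
GAMMA_GRID = [0.001, 0.01, 0.1, 1.0]

mask, c, gamma, best_score = ga_gs(8, toy_score, C_GRID, GAMMA_GRID)
print("selected features:", [i for i, bit in enumerate(mask) if bit])
print("C =", c, "gamma =", gamma, "score =", round(best_score, 3))
```

Because every chromosome is scored at its own best grid point, a feature subset is never rejected merely because it was paired with poorly chosen parameters. The cost is one full grid search per chromosome per generation.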


ADME/T; hERG inhibitor; SVM; Genetic algorithm; Classification model


