QSPR model for the boiling point of diverse organic compounds with applicability domain

QSPR model for predicting the boiling point of diverse organic compounds and its applicability domain

  • Received : 2015.06.18
  • Accepted : 2015.06.30
  • Published : 2015.08.25


Boiling point (BP) is one of the most fundamental physicochemical properties of organic compounds and is used to characterize and identify the thermal behavior of target compounds. Previously developed QSPR equations, however, still have limitations for specific classes of compounds, such as high-energy molecules, mainly because experimental data are scarce and chemical coverage is limited. In this study, a large BP dataset of 5,923 solid organic compounds was secured after careful pre-filtering of experimental data from different sources. The dataset covers not only common organic molecules but also some special-purpose molecules, and it was used to build a new BP prediction model. Various machine learning methods were applied to the newly collected data using a meaningful set of 2D descriptors. The combined validation checks showed acceptable validity and robustness for our models, and consensus approaches combining the individual models were also evaluated. The applicability domain of the BP prediction model is presented based on the descriptors of the training set.
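
The consensus and applicability-domain ideas mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which consensus rule or domain definition was used, so the code below assumes a simple average of model predictions and a descriptor-range (bounding box) check against the training set. All function names, descriptor values, and predicted boiling points are hypothetical and serve only as an illustration.

```python
import numpy as np

def fit_descriptor_ranges(X_train):
    """Return the per-descriptor (min, max) observed in the training set."""
    X_train = np.asarray(X_train, dtype=float)
    return X_train.min(axis=0), X_train.max(axis=0)

def in_domain(X_query, ranges):
    """Flag query compounds whose descriptors all fall inside the training ranges.

    Assumption: a bounding-box domain; the paper may use a different definition.
    """
    lo, hi = ranges
    X_query = np.asarray(X_query, dtype=float)
    return np.all((X_query >= lo) & (X_query <= hi), axis=1)

def consensus_predict(predictions):
    """Average the predictions of several models (rows = models, columns = compounds)."""
    return np.mean(np.asarray(predictions, dtype=float), axis=0)

# Toy training set: 4 compounds described by 2 hypothetical 2D descriptors.
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 15.0], [4.0, 30.0]])
# Two query compounds; the second has a descriptor outside the training range.
X_query = np.array([[2.5, 12.0], [5.0, 12.0]])

ranges = fit_descriptor_ranges(X_train)
mask = in_domain(X_query, ranges)

# Hypothetical boiling points (K) predicted by three different models.
preds = [[350.0, 410.0], [356.0, 400.0], [353.0, 405.0]]
bp = consensus_predict(preds)

for value, ok in zip(bp, mask):
    print(f"BP = {value:.1f} K, inside applicability domain: {ok}")
```

In this sketch a prediction is still produced for the out-of-domain compound, but it is flagged so that it can be treated with less confidence.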


Boiling point;QSPR;machine learning;applicability domain

