Accelerating Group Fusion for Ligand-Based Virtual Screening on Multi-core and Many-core Platforms

  • Received: 2016.04.12
  • Reviewed: 2016.11.21
  • Published: 2016.12.31

Abstract

Screening large compound databases against multiple query compounds is a common performance bottleneck in chemoinformatics applications. This study investigates the problem using group fusion as a pilot model and presents efficient parallel solutions for two platforms: the multi-core architecture of the CPU and the many-core architecture of the graphics processing unit (GPU). The paper presents a study of sequential group fusion and a proposed design for parallel CUDA group fusion. The design addresses the two main stages of group fusion, namely similarity search and fusion (MAX rule), using the embarrassingly parallel and parallel reduction models. Sequential, optimized sequential, and parallel OpenMP versions of group fusion were implemented and evaluated. The analysis of these three approaches informed the design of the parallel CUDA version, with the aim of achieving high computational intensity. The proposed parallel CUDA version outperformed the sequential and parallel OpenMP versions in both execution time and speedup: it was 5-10x faster than both, because both the similarity search and MAX fusion stages were CUDA-optimized.
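The two stages named in the abstract can be illustrated with a minimal sequential sketch. This is not the authors' implementation: the Tanimoto coefficient, the representation of fingerprints as Python integers, and the names `tanimoto`, `group_fusion_max`, and `rank` are all assumptions made for illustration. The sketch only shows why the similarity search is independent per (query, compound) pair and why the MAX rule is a reduction over the query scores.

```python
def tanimoto(a: int, b: int) -> float:
    """Tanimoto coefficient of two binary fingerprints stored as ints.

    Assumption: the abstract does not name the similarity coefficient.
    """
    common = bin(a & b).count("1")
    total = bin(a).count("1") + bin(b).count("1") - common
    return common / total if total else 0.0


def group_fusion_max(queries, database):
    """Fused score per database compound: best similarity to any query.

    Assumes `queries` is non-empty.
    """
    fused = []
    for fp in database:
        # Stage 1 (similarity search): each (query, compound) score is
        # independent of every other, so the work is embarrassingly parallel.
        scores = [tanimoto(q, fp) for q in queries]
        # Stage 2 (fusion, MAX rule): a max-reduction over the query scores.
        fused.append(max(scores))
    return fused


def rank(fused):
    """Indices of database compounds, highest fused score first."""
    return sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)


# Hypothetical 4-bit fingerprints, for illustration only.
queries = [0b1100, 0b0011]
database = [0b1100, 0b0110, 0b0001, 0b1111]
fused = group_fusion_max(queries, database)
ranking = rank(fused)
```

In a parallel version, the outer loop over database compounds is the natural unit to distribute across CPU threads or GPU threads, and the `max` over query scores is the step that a parallel reduction would replace.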
