Extracting the Source Code Context to Predict Import Changes using GPES

  • Lee, Jaekwon (Department of Computer Engineering, Chungbuk National University) ;
  • Kim, Kisub (Department of Computer Engineering, Chungbuk National University) ;
  • Lee, Yong-Hyeon (Neowiz Games) ;
  • Hong, Jang-Eui (Department of Computer Science, Chungbuk National University) ;
  • Seo, Young-Hoon (Department of Computer Engineering, Chungbuk National University) ;
  • Yang, Byung-Do (Department of Electronics Engineering, Chungbuk National University) ;
  • Jung, Woosung (Graduate School of Education, Seoul National University of Education)
  • Received : 2016.09.11
  • Accepted : 2017.02.27
  • Published : 2017.02.28

Abstract

One difficulty developers face when maintaining a large-scale software system is updating to suitable libraries on time. Developers may overlook libraries or make mistakes when searching for and choosing them during development, or no stable library may be available. We present an approach that helps developers modify software easily and on time and avoid software failures. Using GPES, a tool we built previously, we collected project information for our experiments, including abstract syntax trees, tokens, software metrics, relations, and evolution history. We analyzed the source code contexts of existing projects to predict changes automatically and to recommend suitable libraries for those projects. The collected data show that researchers can reduce the overall cost of data analysis by transforming the extracted data into the required input formats with a simple query-based implementation. We also manually evaluated how similar the extracted contexts are to the description and found that a sufficient number of the words in the contexts are similar, which might help developers grasp the domain of the source code easily.
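The comparison between context words and a description was performed manually in this work. As a rough illustration only, the following Python sketch shows one way such a comparison could be approximated automatically: identifiers are split into words, and the share of those words that also occur in a description text is computed. The function names, the sample identifiers, and the sample description are all hypothetical and are not taken from GPES or from the experiments.

```python
import re


def split_identifier(name):
    """Split a camelCase or snake_case identifier into lowercase words."""
    # Matches acronyms (e.g. "HTTP"), capitalized or lowercase words, and digit runs.
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def overlap_ratio(identifiers, description):
    """Return the fraction of distinct context words that appear in the description.

    This is an assumed, simplistic measure (exact word match, no stemming),
    not the evaluation procedure used by the authors.
    """
    context = {w for ident in identifiers for w in split_identifier(ident)}
    desc_words = set(re.findall(r"[a-z]+", description.lower()))
    if not context:
        return 0.0
    return len(context & desc_words) / len(context)


# Hypothetical sample data for illustration.
sample_identifiers = ["RecyclerView", "card_list", "loadImage"]
sample_description = "A list of cards with image loading"
print(overlap_ratio(sample_identifiers, sample_description))
```

Because the sketch matches words exactly, "card" and "cards" count as different words; a stemmer or lemmatizer would be needed to treat them as the same term.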
