Research on Data Acquisition Strategy and Its Application in Web Usage Mining

  • Ran, Cong-Lin (Department of Information Technology Center, Jiujiang University)
  • Joung, Suck-Tae (Department of Computer and Software Engineering, Wonkwang University)
  • Received: 2019.05.06
  • Accepted: 2019.06.11
  • Published: 2019.06.30

Abstract

Web Usage Mining (WUM) is a branch of Web mining and an application of data mining techniques. Web mining technology is used to identify and analyze users' access patterns from the web server log data generated when users visit a web site. Before data mining techniques can be applied to discover user access patterns from web logs, the data must first be acquired in a reasonable way. The main task of data acquisition is to efficiently capture users' detailed click behavior as they visit a web site. This paper focuses on the data acquisition stage, which precedes the first stage of the web usage mining data process, and covers the data acquisition strategy and the field extraction algorithm. The field extraction algorithm separates the fields contained in each single line of a log file, and it has also been applied in practice to large amounts of user data.
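The field extraction step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the log format or the algorithm's details, so this example assumes the NCSA Combined Log Format and uses a regular expression; the names `LOG_PATTERN` and `extract_fields` and the sample line are hypothetical and are not taken from the paper.

```python
import re
from typing import Dict, Optional

# Assumption: each log line follows the NCSA Combined Log Format.
# The paper's actual log layout is not given in the abstract.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def extract_fields(line: str) -> Optional[Dict[str, Optional[str]]]:
    """Split one log line into named fields; return None if it does not parse."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Break the request string into method, URL, and protocol when well formed.
    parts = fields["request"].split()
    if len(parts) == 3:
        fields["method"], fields["url"], fields["protocol"] = parts
    return fields


# Hypothetical sample line, made up for illustration only.
sample = (
    '192.168.1.10 - - [06/May/2019:10:15:32 +0900] '
    '"GET /index.html HTTP/1.1" 200 5120 '
    '"http://example.com/" "Mozilla/5.0"'
)
record = extract_fields(sample)
```

Returning `None` for lines that do not match lets a caller count or skip malformed entries instead of stopping on the first bad line, which matters when processing large log files.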

Fig. 1. The process of web log mining

Fig. 2. The process of data acquisition

Fig. 3. Data acquisition procedure through web logs

Fig. 4. Execution steps of ODBC log data acquisition Strategy

Fig. 5. The process of buried point data acquisition

Fig. 6. Data acquisition procedure through packet sniffer

Fig. 7. Code Snippet of the Buried Point

Fig. 8. JS Code of the Script File ma.js

Fig. 9. Data Storage Architecture

Fig. 10. Data Analysis Model

Acknowledgement

Supported by: Education Department of Jiangxi Province, National Social Science Foundation of China
