Pre-Processing of Query Logs in Web Usage Mining
Abdullah, Norhaiza Ya; Husin, Husna Sarirah; Ramadhani, Herny; Nadarajan, Shanmuga Vivekanada
 Abstract
For the past few years, query log data has been collected to study users' behavior on a site. Many studies have used query logs to extract user preferences, recommend personalization, improve caching and pre-fetching of Web objects, build better adaptive user interfaces, and improve Web search in search engine applications. A query log contains data such as the client's IP address, the time and date of the request, the resource or page requested, the status of the request, the HTTP method used, and the type of browser and operating system. A query log can offer valuable insight into Web site usage. Proper compilation and interpretation of query logs can provide baseline statistics that indicate the usage levels of a Web site and can serve as a tool to assist decision making in management activities. In this paper we discuss the tasks performed on query logs in the pre-processing stage of Web usage mining. We use query logs from an online newspaper company. The query logs undergo a pre-processing stage, in which the clickstream data is cleaned and partitioned into a set of user interactions that represent the activities of each user during their visits to the site. The essential pre-processing tasks applied to the query logs are data cleaning and user identification.
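The two pre-processing tasks named in the abstract, data cleaning and user identification, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the logs follow the Apache/NCSA Combined Log Format, which the abstract does not confirm, and it uses two common heuristics as stand-ins: dropping non-page and unsuccessful requests for cleaning, and treating each (IP address, user agent) pair as one user. The sample log lines and all function names are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: Apache/NCSA Combined Log Format; the paper's log layout may differ.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Illustrative cleaning rule: suffixes treated as embedded resources, not pages.
IGNORED_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")

# Hypothetical log lines, made up for this example.
SAMPLE = [
    '10.0.0.1 - - [01/Jan/2011:10:00:00 +0800] "GET /news/1.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '10.0.0.1 - - [01/Jan/2011:10:00:01 +0800] "GET /img/logo.gif HTTP/1.1" 200 128 "-" "Mozilla/5.0"',
    '10.0.0.1 - - [01/Jan/2011:10:00:05 +0800] "GET /missing.html HTTP/1.1" 404 0 "-" "Mozilla/5.0"',
    '10.0.0.2 - - [01/Jan/2011:10:01:00 +0800] "GET /sports/2.html HTTP/1.1" 200 256 "-" "Opera/9.80"',
]


def parse_line(line):
    """Return the log fields as a dict, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def is_relevant(entry):
    """Data cleaning: keep only successful GET requests for pages."""
    path = entry["url"].split("?", 1)[0].lower()
    return (
        entry["method"] == "GET"
        and entry["status"].startswith("2")
        and not path.endswith(IGNORED_SUFFIXES)
    )


def identify_users(entries):
    """User identification heuristic: one user per (IP, user agent) pair."""
    users = defaultdict(list)
    for entry in entries:
        users[(entry["ip"], entry["agent"])].append(entry["url"])
    return dict(users)


def preprocess(lines):
    """Parse, clean, and group log lines into per-user page lists."""
    parsed = (parse_line(line) for line in lines)
    cleaned = [e for e in parsed if e is not None and is_relevant(e)]
    return identify_users(cleaned)
```

Calling `preprocess(SAMPLE)` drops the image request and the 404 response and groups the remaining page requests under two users. The (IP address, user agent) heuristic is a simplification: it can merge distinct users behind a shared proxy and split one user who switches browsers.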
 Keywords
Pre-Processing; Web Log; Web Usage Mining
 Language
English