A Study on Classifications of Useful Customer Reviews by Applying Text Mining Approach

텍스트 마이닝을 활용한 고객 리뷰의 유용성 지수 개선에 관한 연구 (A Study on Improving the Usefulness Index of Customer Reviews Using Text Mining)

  • Received : 2015.07.20
  • Accepted : 2015.09.29
  • Published : 2015.12.31

Abstract

Customer reviews are an important source of information for purchase decisions in online stores, and online stores try to present useful reviews to customers on product pages. To assess the usefulness of a review before enough users have voted on it, previous studies have drawn on diverse aspects of reviews, most commonly stylistic and semantic information. This study tests diverse algorithms and datasets to identify a suitable classification method and threshold for classifying useful reviews. Most prior studies used a ratio-type helpfulness index, as used by Amazon.com. However, TripAdvisor.com and Yelp.com use another type, a count-type helpfulness index, for which no suitable threshold for classifying useful reviews has yet been established. This study used restaurant reviews from Yelp.com and their usefulness votes to construct diverse datasets, and applied text mining approaches to classify useful reviews. Random Forest, SVM, and GLMNET achieved higher accuracy than the other approaches.
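The difference between the two index types can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: the `Review` class, both function names, and the threshold values (0.6 and 3) are hypothetical, chosen only to show how a ratio-type label differs from a count-type label.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Review:
    text: str
    helpful_votes: int
    # Sites with a count-type index record only "useful" votes,
    # so the total number of votes may be unavailable.
    total_votes: Optional[int] = None


def ratio_label(review: Review, threshold: float = 0.6,
                min_votes: int = 1) -> Optional[bool]:
    """Ratio-type index: share of helpful votes among all votes.

    Returns None when the ratio cannot be computed (no total, or too few votes).
    The threshold of 0.6 is an illustrative assumption.
    """
    if review.total_votes is None or review.total_votes < min_votes:
        return None
    return review.helpful_votes / review.total_votes >= threshold


def count_label(review: Review, threshold: int = 3) -> bool:
    """Count-type index: raw number of helpful votes.

    The threshold of 3 is an illustrative assumption, not a recommended value.
    """
    return review.helpful_votes >= threshold


if __name__ == "__main__":
    amazon_style = Review("Detailed and balanced.", helpful_votes=4, total_votes=5)
    yelp_style = Review("Good tacos.", helpful_votes=2)
    print(ratio_label(amazon_style))  # ratio 4/5 compared with 0.6
    print(ratio_label(yelp_style))    # no total votes, so no ratio label
    print(count_label(yelp_style))    # 2 votes compared with 3
```

With a count-type index the label depends entirely on where the cutoff is placed, which is why the choice of threshold matters for the datasets described above.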
