Online Reviews Analysis for Prediction of Product Ratings based on Topic Modeling

토픽 모델링에 기반한 온라인 상품 평점 예측을 위한 온라인 사용 후기 분석

  • Received : 2017.04.28
  • Accepted : 2017.06.23
  • Published : 2017.09.30

Abstract

Customers are influenced by others' opinions when they make a purchase. Thanks to advances in technology, people share their experiences, such as reviews and ratings, through online and social network services. However, although ratings give others intuitive information, many reviews contain only text without ratings. Also, because the volume of reviews is huge, customers and companies cannot read all of them, which makes it hard to evaluate a product without ratings. Therefore, in this study, we propose a methodology for predicting a product's ratings from its reviews. In this methodology, we first estimate the topic-review matrix using Latent Dirichlet Allocation, a technique widely used in topic modeling. Next, we predict ratings from the topic-review matrix using an artificial neural network model trained with the backpropagation algorithm. Through experiments with actual reviews, we find that our methodology can predict ratings from customers' reviews, and that it performs better on reviews that express definite opinions. As a result, our study can be useful to customers and companies that want to assess a product accurately through ratings. Moreover, we hope that our study leads to future studies that combine machine learning and topic modeling.
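
The two-stage pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how the topic-review matrix is inferred, so collapsed Gibbs sampling is assumed here, and the network size, learning rate, hyperparameters, toy reviews, and the function names `lda_gibbs` and `train_mlp` are all hypothetical choices made for illustration.

```python
import math
import random

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA (assumed inference method).
    Returns one topic distribution per document (the topic-review matrix)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]              # tokens of doc d in topic k
    topic_word = [[0] * n_words for _ in range(n_topics)]   # tokens of word v in topic k
    topic_total = [0] * n_topics                            # tokens in topic k
    assign = []
    for d, doc in enumerate(docs):                          # random initial topics
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assign.append(z_d)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = word_id[w], assign[d][i]
                doc_topic[d][k] -= 1                        # remove current assignment
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha) * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k                            # resample and restore counts
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    return [
        [(doc_topic[d][t] + alpha) / (len(doc) + n_topics * alpha)
         for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]

def train_mlp(X, y, n_hidden=4, epochs=2000, lr=0.5, seed=0):
    """One-hidden-layer network trained by backpropagation on squared error.
    Ratings in [1, 5] are rescaled to [0, 1] to match the sigmoid output."""
    rng = random.Random(seed)
    n_in = len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0

    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    def forward(x):
        h = [sigmoid(sum(w * xi for w, xi in zip(W1[j], x)) + b1[j])
             for j in range(n_hidden)]
        out = sigmoid(sum(w * hj for w, hj in zip(W2, h)) + b2)
        return h, out

    for _ in range(epochs):
        for x, rating in zip(X, y):
            target = (rating - 1) / 4
            h, out = forward(x)
            delta_out = (out - target) * out * (1 - out)    # output-layer error
            for j in range(n_hidden):
                # hidden-layer error uses W2[j] before it is updated
                delta_h = delta_out * W2[j] * h[j] * (1 - h[j])
                W2[j] -= lr * delta_out * h[j]
                b1[j] -= lr * delta_h
                for i in range(n_in):
                    W1[j][i] -= lr * delta_h * x[i]
            b2 -= lr * delta_out
    return lambda x: 1 + 4 * forward(x)[1]                  # map back to the 1-5 scale

# Toy data, invented for illustration only.
reviews = [
    "great battery great screen love it",
    "love the screen great value",
    "terrible battery broke fast refund",
    "broke after a week terrible refund",
]
ratings = [5, 5, 1, 1]
docs = [r.split() for r in reviews]
theta = lda_gibbs(docs, n_topics=2)      # stage 1: topic-review matrix
predict = train_mlp(theta, ratings)      # stage 2: rating predictor
predicted = [predict(row) for row in theta]
```

Each row of `theta` is a review's topic distribution and serves as the input feature vector for the network, so the rating model never sees raw words. In practice a library implementation of both stages would likely replace these hand-written functions.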
