Finding Rotten Eggs: A Review Spam Detection Model using Diverse Feature Sets

  • Akram, Abubakker Usman (Department of Computer Science, COMSATS Institute of Information Technology) ;
  • Khan, Hikmat Ullah (Department of Computer Science, COMSATS Institute of Information Technology) ;
  • Iqbal, Saqib (Department of Software Engineering, Al-Ain University of Science and Technology) ;
  • Iqbal, Tassawar (Department of Computer Science, COMSATS Institute of Information Technology) ;
  • Munir, Ehsan Ullah (Department of Computer Science, COMSATS Institute of Information Technology) ;
  • Shafi, Dr. Muhammad (Department of Computer Science, Air University)
  • Received : 2018.02.02
  • Accepted : 2018.04.23
  • Published : 2018.10.31

Abstract

Social media enables customers to share their views, opinions, and experiences as product reviews, which help other customers buy quality products. Because online reviews carry this weight, fake reviews, commonly known as spam reviews, are written to mislead potential customers in their purchase decisions. To address this issue, review spam detection has become an active research area. Existing studies have relied on feature engineering, but they consider only a limited number of features. This paper proposes a Feature-Centric Model for Review Spam Detection (FMRSD). The proposed model examines a wide range of feature sets covering ratings, sentiments, content, and users. Experiments show that the proposed technique outperforms the baseline.
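The abstract names four feature families (ratings, sentiments, content, and users) without listing the individual features. The sketch below illustrates how one feature from each family might be computed for a single review. It is not the FMRSD implementation: the `Review` record, the `extract_features` function, the toy word lists, and the specific features chosen are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean

# Toy sentiment lexicon (assumption; a real system would use a full
# lexical resource rather than a handful of words).
POSITIVE = {"great", "excellent", "love", "best", "amazing"}
NEGATIVE = {"bad", "poor", "terrible", "worst", "hate"}


@dataclass
class Review:
    """Hypothetical review record; field names are illustrative."""
    user_id: str
    product_id: str
    rating: float  # e.g. 1 to 5 stars
    text: str


def extract_features(review, all_reviews):
    """Return one illustrative feature per family, plus a word count."""
    # Tokenize on whitespace and strip basic punctuation.
    words = [w.strip(".,!?").lower() for w in review.text.split()]
    words = [w for w in words if w]

    # Rating feature: deviation from the product's mean rating.
    product_ratings = [r.rating for r in all_reviews
                       if r.product_id == review.product_id]
    if product_ratings:
        rating_deviation = abs(review.rating - mean(product_ratings))
    else:
        rating_deviation = 0.0

    # Sentiment feature: (positive - negative) words per word.
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    sentiment = (pos - neg) / len(words) if words else 0.0

    # Content feature: share of alphabetic characters in upper case.
    letters = [c for c in review.text if c.isalpha()]
    caps_ratio = (sum(c.isupper() for c in letters) / len(letters)
                  if letters else 0.0)

    # User feature: number of reviews written by the same user.
    user_review_count = sum(r.user_id == review.user_id
                            for r in all_reviews)

    return {
        "rating_deviation": rating_deviation,
        "sentiment": sentiment,
        "word_count": len(words),
        "caps_ratio": caps_ratio,
        "user_review_count": user_review_count,
    }
```

A dictionary of this kind could be turned into a numeric vector and passed to any supervised classifier. The abstract does not state which classifier or baseline the authors used, so none is assumed here.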

Cited by

  1. A feature-centric spam email detection model using diverse supervised machine learning algorithms vol.38, pp.3, 2020, https://doi.org/10.1108/el-07-2019-0181
  2. Resampling imbalanced data to detect fake reviews using machine learning classifiers and textual-based features vol.80, pp.9, 2021, https://doi.org/10.1007/s11042-020-10299-5
  3. Using a hybrid content-based and behaviour-based featuring approach in a parallel environment to detect fake reviews vol.47, 2021, https://doi.org/10.1016/j.elerap.2021.101048