Predictive Clustering-based Collaborative Filtering Technique for Performance-Stability of Recommendation System

  • Lee, O-Joun (School of Computer Engineering, Chung-Ang University) ;
  • You, Eun-Soon (Institute of Media Content, Dankook University)
  • Received : 2015.02.10
  • Accepted : 2015.03.20
  • Published : 2015.03.31

Abstract

With the explosive growth in the volume of information, Internet users have considerable difficulty finding the information they need online. Recommender systems, which provide information tailored to user preferences and tastes, have therefore become increasingly important as a way to address information overload. A number of techniques have been proposed to this end, including content-based filtering (CBF), demographic filtering (DF), and collaborative filtering (CF). CBF and DF require external information and thus cannot be applied across a wide variety of domains. CF, on the other hand, is widely used because it is relatively free from domain constraints. CF techniques are broadly classified into memory-based CF, model-based CF, and hybrid CF. Model-based CF addresses the drawbacks of CF by applying a model such as a Bayesian model, a clustering model, or a dependency network model. This approach not only alleviates the sparsity and scalability issues but also boosts predictive performance. However, it involves expensive model building and incurs a tradeoff between performance and scalability. This tradeoff gives rise to reduced coverage, which is a type of sparsity issue. In addition, expensive model building may lead to performance instability, because the high cost prevents changes in the domain environment from being incorporated into the model immediately. Changes that accumulate without being reflected in the model eventually undermine system performance. This study combines a Markov model of transition probabilities and the concept of fuzzy clustering with clustering-based CF (CBCF) to propose predictive clustering-based CF (PCCF), which addresses both reduced coverage and unstable performance. The method mitigates performance instability by tracking changes in user preferences and thereby bridging the gap between the static model and dynamic users. 
Furthermore, reduced coverage is mitigated by expanding coverage based on transition probabilities and cluster membership probabilities. The proposed method consists of four processes. First, in preference clustering, user preferences are normalized. Second, in preference transition detection, changes in user preferences are detected from review score entries. Third, in propensity clustering, user propensities are normalized using the patterns of change in user preferences (propensities). Lastly, in preference prediction, a preference prediction model is built to predict user preferences for items. The proposed method was validated by testing its robustness against performance instability and against the scalability-performance tradeoff. The first test compared the performance of recommender systems based on IBCF, CBCF, ICFEC, and PCCF in an environment where data sparsity had been minimized. The second test varied the number of clusters in CBCF, ICFEC, and PCCF and compared the resulting changes in system performance. The results showed that the proposed method yielded only an insignificant improvement in performance over the existing techniques. It also failed to achieve a significant improvement in standard deviation, which indicates the degree of data fluctuation. Nevertheless, it showed a marked improvement over the existing techniques in range, which indicates the extent of performance fluctuation. In the first test, the performance fluctuation before and after model generation improved by 51.31%. In the second test, the performance fluctuation caused by changes in the number of clusters improved by 36.05%. This signifies that the proposed method, despite only a slight performance improvement, clearly offers better performance stability than the existing techniques. 
Future work will be directed toward enhancing recommendation performance, which did not improve significantly over the existing techniques. It will consider introducing a high-dimensional parameter-free clustering algorithm or a deep learning-based model to improve recommendation performance.
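To make the combination of transition probabilities and fuzzy cluster membership more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that each user's history has already been reduced to a time-ordered sequence of preference-cluster indices, estimates a first-order Markov transition matrix from those sequences, and weights per-cluster mean ratings by the predicted membership. The function names, the toy histories, and the per-cluster mean ratings are all hypothetical.

```python
def transition_matrix(sequences, n_clusters):
    """Estimate first-order Markov transition probabilities between clusters.

    sequences: per-user lists of cluster indices, ordered in time.
    Rows with no observed transitions fall back to a uniform distribution.
    """
    counts = [[0.0] * n_clusters for _ in range(n_clusters)]
    for seq in sequences:
        for src, dst in zip(seq, seq[1:]):
            counts[src][dst] += 1.0
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            matrix.append([1.0 / n_clusters] * n_clusters)
        else:
            matrix.append([c / total for c in row])
    return matrix


def next_membership(membership, matrix):
    """Propagate a fuzzy membership vector one step through the chain."""
    n = len(matrix)
    return [sum(membership[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def predict_rating(membership, matrix, cluster_item_means):
    """Expected rating: per-cluster mean ratings weighted by predicted membership."""
    predicted = next_membership(membership, matrix)
    return sum(p * m for p, m in zip(predicted, cluster_item_means))


# Hypothetical toy data: three users' cluster histories over time.
histories = [[0, 0, 1], [0, 0, 0], [1, 1, 0]]
P = transition_matrix(histories, n_clusters=2)

# A user who currently belongs 80% to cluster 0 and 20% to cluster 1.
rating = predict_rating([0.8, 0.2], P, cluster_item_means=[4.0, 2.0])
print(P)
print(round(rating, 3))
```

Because every cluster with a non-zero predicted membership contributes to the estimate, an item rated only in a neighbouring cluster can still receive a prediction, which is one plausible reading of how soft membership and transitions could expand coverage.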

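The importance of recommender systems, which provide information in consideration of users' tastes and preferences, has increased. Various techniques have been proposed for this purpose, and collaborative filtering, which is relatively free from domain constraints, is widely used. Model-based collaborative filtering, one type of collaborative filtering, combines machine learning or data mining models with collaborative filtering. It alleviates fundamental limitations of collaborative filtering such as the sparsity and scalability problems, but it has limitations of its own: the cost of model building is high, and a performance/scalability tradeoff arises. The performance/scalability tradeoff causes the reduced coverage problem, which is a type of sparsity problem. In addition, the high cost of model building becomes a cause of performance instability through the accumulation of changes in the domain environment. To solve these problems, this study combines the Markov transition probability model and the concept of fuzzy clustering with clustering-based collaborative filtering, and proposes a predictive clustering-based collaborative filtering technique that addresses the reduced coverage and performance instability problems. First, the technique mitigates performance instability by tracking changes in user preferences and thereby resolving the gap between the static model and dynamic users. Second, it mitigates reduced coverage by expanding coverage based on transition probabilities and cluster membership probabilities. The proposed technique was validated through robustness tests against the performance instability problem and the scalability/performance tradeoff problem, respectively. Compared with existing techniques, the improvement in performance is marginal. The technique also did not show a meaningful improvement in standard deviation, an indicator of the degree of data fluctuation. However, it did show an improvement over existing techniques in range, which indicates the extent of performance fluctuation. The first experiment showed a 51.31% improvement in the performance fluctuation before and after model generation, and the second experiment showed a 36.05% improvement in the performance fluctuation under changes in the number of clusters. This means that although the proposed technique does not improve performance, it improves on the existing techniques in terms of performance stability.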
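The two fluctuation measures discussed above, range and standard deviation, can be illustrated with a short Python sketch. It assumes that stability is assessed over a series of error scores measured at successive points (for example, before and after model regeneration) and that improvement is reported as a percentage reduction relative to a baseline. The scores below are hypothetical and are not taken from the paper's experiments.

```python
from statistics import pstdev


def fluctuation(scores):
    """Return (range, population standard deviation) of a series of scores."""
    return max(scores) - min(scores), pstdev(scores)


def relative_improvement(baseline, proposed):
    """Percentage reduction of `proposed` relative to `baseline`."""
    return (baseline - proposed) / baseline * 100.0


# Hypothetical error scores measured at successive points in time.
baseline_scores = [0.80, 0.82, 0.90, 0.81]
proposed_scores = [0.80, 0.84, 0.85, 0.83]

base_range, base_std = fluctuation(baseline_scores)
prop_range, prop_std = fluctuation(proposed_scores)

print("range improvement (%):", relative_improvement(base_range, prop_range))
print("std improvement (%):", relative_improvement(base_std, prop_std))
```

Range depends only on the two extreme scores, so it captures the worst-case swing in performance, whereas standard deviation averages deviations over all measurements. The two measures can therefore improve by different amounts on the same data.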
