Method to Improve Data Sparsity Problem of Collaborative Filtering Using Latent Attribute Preference

  • Kwon, Hyeong-Joon (School of Information and Communication Engineering, Sungkyunkwan University)
  • Hong, Kwang-Seok (School of Information and Communication Engineering, Sungkyunkwan University)
  • Received : 2013.01.25
  • Accepted : 2013.07.19
  • Published : 2013.10.31


In this paper, we propose LAR_CF (Latent Attribute Rating-based Collaborative Filtering), a method that is robust to the data sparsity problem, one of the long-standing problems that degrade the rating prediction accuracy of collaborative filtering. Existing collaborative filtering uses only the preference ratings explicitly given by users as the feature vector for judging the similarity between objects, and earlier studies that used attributes to alleviate the problem were difficult to apply generally. LAR_CF, which is rooted in neighborhood-based filtering, uses the intrinsic attributes of the two objects being compared as features alongside the existing explicit preferences. It thereby alleviates the data sparsity problem that arises when explicit ratings are few, improves rating prediction accuracy, and can be applied easily regardless of the type of attribute. To evaluate LAR_CF, we used the MovieLens 100k dataset and its item attributes, measuring rating prediction accuracy in a full-rating (general performance) experiment and in an artificial data sparsity experiment. The results confirm that the proposed method can improve rating prediction accuracy under sparse-data conditions.
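The abstract does not give the LAR_CF formulas, so the following is only a minimal sketch of the general idea, under stated assumptions: a user's "latent attribute rating" is taken to be the mean of that user's explicit ratings over the items carrying an attribute (for example, a genre), and similarity is a cosine over the explicit ratings extended with those attribute ratings. The function names, the averaging rule, and the toy data are all hypothetical and are not taken from the paper.

```python
import math


def latent_attribute_ratings(user_ratings, item_attrs):
    """Assumed rule: average a user's explicit ratings per attribute."""
    sums, counts = {}, {}
    for item, rating in user_ratings.items():
        for attr in item_attrs.get(item, ()):
            sums[attr] = sums.get(attr, 0.0) + rating
            counts[attr] = counts.get(attr, 0) + 1
    return {attr: sums[attr] / counts[attr] for attr in sums}


def cosine(u, v):
    """Cosine similarity over the keys that both sparse vectors share."""
    common = set(u) & set(v)
    if not common:
        return 0.0  # no overlap: similarity cannot be computed from ratings
    dot = sum(u[k] * v[k] for k in common)
    norm_u = math.sqrt(sum(u[k] ** 2 for k in common))
    norm_v = math.sqrt(sum(v[k] ** 2 for k in common))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def augmented_vector(user_ratings, item_attrs):
    """Explicit ratings plus the assumed latent attribute ratings."""
    vec = {("item", i): float(r) for i, r in user_ratings.items()}
    latent = latent_attribute_ratings(user_ratings, item_attrs)
    vec.update({("attr", a): r for a, r in latent.items()})
    return vec


# Hypothetical toy data: the two users share no rated movie.
item_attrs = {
    "m1": {"comedy"},
    "m2": {"comedy", "romance"},
    "m3": {"action"},
    "m4": {"action"},
}
alice = {"m1": 5, "m3": 2}
bob = {"m2": 4, "m4": 1}

plain_sim = cosine(alice, bob)
augmented_sim = cosine(
    augmented_vector(alice, item_attrs),
    augmented_vector(bob, item_attrs),
)
print(plain_sim, augmented_sim)
```

In this toy case the two users have no co-rated item, so the rating-only similarity is zero, while the attribute-augmented vectors still overlap on the shared genres. That is the kind of sparse condition the abstract says the method targets.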



Supported by: National Research Foundation of Korea

