When sparsity meets noise in collaborative filtering

  • Authors:
  • Biyun Hu; Zhoujun Li; Wenhan Chao

  • Affiliations:
  • State Key Laboratory of Software Development Environment, Beihang University, China and School of Computer Science and Engineering, Beihang University, Beijing, China (all three authors)

  • Venue:
  • APWeb'12 Proceedings of the 14th Asia-Pacific international conference on Web Technologies and Applications
  • Year:
  • 2012

Abstract

Data sparsity is traditionally regarded as a major problem for user-based collaborative filtering. That analysis, however, considers only data quantity and ignores data quality, which is an important characteristic of data. Sparse but high-quality data may actually benefit the algorithm, so the traditional analysis is one-sided. In this paper, the effects of training ratings with different levels of sparsity on recommendation quality are first investigated on a real-world dataset. Preliminary experimental results show that data sparsity can have positive effects on both recommendation accuracy and coverage. Next, a measure of data noise is introduced. Then, taking data noise into account, the effects of data sparsity on the recommendation quality of the algorithm are re-evaluated. Experimental results show that if sparsity implies high data quality (low noise), then sparsity is good for both recommendation accuracy and coverage. This result shows that the traditional analysis of the effect of data sparsity is one-sided, and implies that recommendation quality can be improved substantially by choosing high-quality data.
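
To make the quantities named in the abstract concrete, the following is a minimal sketch of how sparsity, accuracy, and coverage could be measured for a user-based collaborative filtering predictor. The abstract does not specify the similarity measure, the accuracy metric, or the noise measure, so everything here is an assumption for illustration: Pearson similarity, mean absolute error (MAE) as the accuracy metric, coverage as the fraction of held-out ratings for which a prediction can be made, and a tiny made-up rating set. The paper's noise measure is not reproduced.

```python
from math import sqrt


def sparsity(ratings, n_users, n_items):
    """Fraction of the user-item matrix that has no rating."""
    observed = sum(len(items) for items in ratings.values())
    return 1.0 - observed / (n_users * n_items)


def pearson(a, b):
    """Pearson correlation over co-rated items; 0.0 when undefined."""
    common = set(a) & set(b)
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = sqrt(sum((a[i] - mean_a) ** 2 for i in common)) * sqrt(
        sum((b[i] - mean_b) ** 2 for i in common)
    )
    return num / den if den else 0.0


def predict(ratings, user, item):
    """Mean-centred weighted average over positively correlated neighbours.

    Returns None when no such neighbour has rated the item.
    """
    mine = ratings[user]
    mean_u = sum(mine.values()) / len(mine)
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        w = pearson(mine, theirs)
        if w <= 0:
            continue
        mean_o = sum(theirs.values()) / len(theirs)
        num += w * (theirs[item] - mean_o)
        den += w
    return mean_u + num / den if den else None


def evaluate(train, test):
    """Return (MAE, coverage) over held-out (user, item, rating) triples."""
    errors = []
    for user, item, actual in test:
        predicted = predict(train, user, item) if user in train else None
        if predicted is not None:
            errors.append(abs(predicted - actual))
    coverage = len(errors) / len(test)
    mae = sum(errors) / len(errors) if errors else float("nan")
    return mae, coverage


# Hypothetical toy data: user -> {item: rating}.
train = {
    "u1": {"a": 5, "b": 3, "c": 4},
    "u2": {"a": 4, "b": 2, "c": 3, "d": 4},
    "u3": {"a": 1, "b": 5, "d": 2},
}
test = [("u1", "d", 5), ("u1", "e", 3)]

train_sparsity = sparsity(train, n_users=3, n_items=5)
mae, coverage = evaluate(train, test)
print(train_sparsity, mae, coverage)
```

In this toy setting only one of the two held-out ratings can be predicted, because no training user has rated item "e", so coverage is 0.5. Repeating the evaluation on training subsets of different sparsity, selected by whatever quality criterion is under study, would be one way to reproduce the kind of comparison the abstract describes.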