When sparsity meets noise in collaborative filtering

Authors:
Biyun Hu;Zhoujun Li;Wenhan Chao
Affiliations:
State Key Laboratory of Software Development Environment, Beihang University, China and School of Computer Science and Engineering, Beihang University, Beijing, China (all three authors)
Venue:
APWeb'12 Proceedings of the 14th Asia-Pacific international conference on Web Technologies and Applications
Year:
2012

Abstract

Data sparsity is traditionally assumed to be a major problem for the user-based collaborative filtering algorithm. However, this analysis considers only data quantity and ignores data quality, which is an important characteristic of data. Sparse but high-quality data may be good for the algorithm, so the analysis is one-sided. In this paper, the effects of training ratings with different levels of sparsity on recommendation quality are first investigated on a real-world dataset. Preliminary experimental results show that data sparsity can have positive effects on both recommendation accuracy and coverage. Next, a measure of data noise is introduced. Then, taking data noise into consideration, the effects of data sparsity on the recommendation quality of the algorithm are re-evaluated. Experimental results show that if sparsity implies high data quality (low noise), then sparsity is good for both recommendation accuracy and coverage. This result shows that the traditional analysis of the effect of data sparsity is one-sided, and it implies that recommendation quality can be improved substantially by choosing high-quality data.
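The kind of experiment the abstract describes (thinning the training ratings and then measuring accuracy and coverage of user-based collaborative filtering) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the abstract does not specify the similarity measure, the accuracy metric, or the sampling scheme, so this sketch uses Pearson correlation, mean absolute error (MAE), and uniform random subsampling. The function names `pearson`, `predict`, `evaluate`, and `subsample` and the toy ratings are hypothetical. The paper's noise measure is not described in the abstract and is not modeled here.

```python
import random
from math import sqrt


def pearson(ra, rb):
    """Pearson correlation over co-rated items; 0.0 when undefined."""
    common = set(ra) & set(rb)
    if len(common) < 2:
        return 0.0
    ma = sum(ra[i] for i in common) / len(common)
    mb = sum(rb[i] for i in common) / len(common)
    num = sum((ra[i] - ma) * (rb[i] - mb) for i in common)
    da = sqrt(sum((ra[i] - ma) ** 2 for i in common))
    db = sqrt(sum((rb[i] - mb) ** 2 for i in common))
    return num / (da * db) if da and db else 0.0


def predict(train, user, item):
    """Mean-centered weighted average over positively correlated users.

    Returns None when no prediction can be made (this drives coverage).
    """
    own = train.get(user)
    if not own:
        return None
    own_mean = sum(own.values()) / len(own)
    num = den = 0.0
    for other, ratings in train.items():
        if other == user or item not in ratings:
            continue
        w = pearson(own, ratings)
        if w <= 0:
            continue  # assumption: ignore non-positive neighbors
        other_mean = sum(ratings.values()) / len(ratings)
        num += w * (ratings[item] - other_mean)
        den += w
    return own_mean + num / den if den else None


def evaluate(train, held_out):
    """Return (MAE, coverage) for a list of (user, item, rating) triples."""
    errors = []
    for user, item, rating in held_out:
        p = predict(train, user, item)
        if p is not None:
            errors.append(abs(p - rating))
    coverage = len(errors) / len(held_out) if held_out else 0.0
    mae = sum(errors) / len(errors) if errors else None
    return mae, coverage


def subsample(train, keep, seed=0):
    """Keep a uniformly random fraction of ratings to raise sparsity."""
    rng = random.Random(seed)
    triples = [(u, i, r) for u, items in train.items() for i, r in items.items()]
    kept = rng.sample(triples, int(round(keep * len(triples))))
    sparse = {}
    for u, i, r in kept:
        sparse.setdefault(u, {})[i] = r
    return sparse


# Toy data, invented for illustration only.
train = {
    "a": {"i1": 5, "i2": 3, "i3": 4},
    "b": {"i1": 4, "i2": 2, "i3": 3, "i4": 4},
    "c": {"i1": 1, "i2": 5, "i4": 2},
}
held_out = [("a", "i4", 5), ("a", "i5", 3)]

mae, coverage = evaluate(train, held_out)
sparser_train = subsample(train, keep=0.5)
```

Calling `evaluate(subsample(train, keep), held_out)` for several values of `keep` gives accuracy and coverage as a function of sparsity. Uniform subsampling changes only data quantity; selecting which ratings to keep by a quality criterion would be a different sampling function under the same evaluation loop.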