Predicting who rated what in large-scale datasets

Authors:
Yan Liu; Zhenzhen Kou
Affiliations:
IBM T. J. Watson Research Center, Yorktown Heights, NY; Carnegie Mellon University, Pittsburgh, PA
Venue:
ACM SIGKDD Explorations Newsletter - Special issue on visual analytics
Year:
2007

Abstract

KDD Cup 2007 focuses on movie rating behavior. The goal of the "Who Rated What" task is to predict whether "existing" users will review "existing" movies in the future. We cast the task as a link prediction problem and address it with a simple classification approach. Compared with other link prediction applications, our task poses two major challenges: (1) the huge size of the Netflix data, and (2) a prediction target complicated by many factors, such as a general decline of interest in older movies and a growing tendency of Netflix users to review more movies, driven by the success of the Internet DVD rental industry. We address the first challenge by "selective" subsampling and the second by effectively combining information from review scores, movie content, and graph topology.
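
To make the link prediction framing concrete, the following is a minimal sketch of how candidate (user, movie) pairs might be sampled and turned into feature vectors for a classifier. It is not the authors' method: the function names, the degree-weighted sampling rule, and the three features are all illustrative assumptions, since the abstract does not specify how the "selective" subsampling or the feature combination is done.

```python
import random
from collections import defaultdict


def summarize(ratings):
    """Per-user and per-movie rating counts plus mean score per movie.

    `ratings` is an iterable of (user, movie, score) triples (assumed layout).
    """
    user_deg, movie_deg = defaultdict(int), defaultdict(int)
    score_sum = defaultdict(float)
    for user, movie, score in ratings:
        user_deg[user] += 1
        movie_deg[movie] += 1
        score_sum[movie] += score
    movie_mean = {m: score_sum[m] / movie_deg[m] for m in movie_deg}
    return dict(user_deg), dict(movie_deg), movie_mean


def pair_features(user, movie, user_deg, movie_deg, movie_mean):
    """Hypothetical feature vector for one candidate (user, movie) link."""
    return [
        user_deg.get(user, 0),        # graph topology: how active the user is
        movie_deg.get(movie, 0),      # graph topology: how popular the movie is
        movie_mean.get(movie, 0.0),   # review scores: mean rating of the movie
    ]


def sample_unrated_pairs(ratings, n_pairs, seed=0, max_tries=10_000):
    """Draw up to n_pairs unrated (user, movie) pairs.

    Assumption: users and movies are drawn in proportion to their rating
    counts, so the sample concentrates on active users and popular movies.
    """
    rng = random.Random(seed)
    rated = {(u, m) for u, m, _ in ratings}
    users = [u for u, _, _ in ratings]    # repeated entries give degree weighting
    movies = [m for _, m, _ in ratings]
    pairs = set()
    for _ in range(max_tries):
        if len(pairs) >= n_pairs:
            break
        pair = (rng.choice(users), rng.choice(movies))
        if pair not in rated:
            pairs.add(pair)
    return sorted(pairs)
```

The resulting feature vectors could be passed to any standard binary classifier, with the label indicating whether the pair is rated in a later time window. Movie content features, which the abstract also mentions, are omitted here.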