Training and testing of recommender systems on data missing not at random

Authors:
Harald Steck
Affiliations:
Bell Labs, Alcatel-Lucent, Murray Hill, NJ, USA
Venue:
Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2010



Abstract

Users typically rate only a small fraction of all available items. We show that the absence of ratings carries useful information for improving the top-k hit rate over all items, a natural accuracy measure for recommendations. To test recommender systems, we present two performance measures that can be estimated, under mild assumptions, without bias from data even when ratings are missing not at random (MNAR). To achieve optimal test results, we present appropriate surrogate objective functions for efficient training on MNAR data. Their main property is that they account for all ratings, whether observed or missing in the data. For the top-k hit rate on test data, our experiments indicate dramatic improvements over even sophisticated methods that are optimized on observed ratings only.
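The top-k hit rate over all items, as mentioned in the abstract, can be illustrated with a minimal sketch. The abstract does not give the exact definition, so this sketch assumes the following: each user has a score for every item, a set of held-out relevant items, and hits are pooled across users. The function name `top_k_hit_rate` and the data layout are hypothetical and chosen only for illustration. Excluding items already seen in training is omitted for brevity.

```python
from typing import Dict, List, Set


def top_k_hit_rate(scores: Dict[str, List[float]],
                   relevant: Dict[str, Set[int]],
                   k: int) -> float:
    """Fraction of held-out relevant items that appear in each user's top-k list.

    Assumption: the ranking covers ALL items (rated or not), and hits are
    pooled over users rather than averaged per user.
    """
    hits = 0
    total = 0
    for user, user_scores in scores.items():
        rel = relevant.get(user, set())
        if not rel:
            continue  # users without relevant test items contribute nothing
        # Rank every item by descending score; ties are broken by item index.
        ranked = sorted(range(len(user_scores)),
                        key=lambda i: (-user_scores[i], i))
        top_k = set(ranked[:k])
        hits += len(top_k & rel)
        total += len(rel)
    return hits / total if total else 0.0


# Toy data: two users, four items, predicted scores for every item.
example_scores = {"u1": [0.9, 0.1, 0.8, 0.3], "u2": [0.2, 0.7, 0.1, 0.6]}
example_relevant = {"u1": {0, 3}, "u2": {1}}
example_rate = top_k_hit_rate(example_scores, example_relevant, k=2)
```

In the toy data, user `u1` has one of two relevant items in the top 2 and user `u2` has one of one, so the pooled rate is 2/3. Ranking against all items, rather than only the rated ones, is what makes the measure sensitive to how a model scores items whose ratings are missing.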