CoBaFi: collaborative Bayesian filtering

Authors:
Alex Beutel;Kenton Murray;Christos Faloutsos;Alexander J. Smola
Affiliations:
Carnegie Mellon University, Pittsburgh, PA, USA;Carnegie Mellon University, Pittsburgh, PA, USA;Carnegie Mellon University, Pittsburgh, PA, USA;Carnegie Mellon University, Pittsburgh, PA, USA
Venue:
Proceedings of the 23rd International Conference on World Wide Web
Year:
2014

Abstract

Given a large dataset of users' ratings of movies, what is the best model to accurately predict which movies a person will like? How can we prevent spammers from tricking our algorithms into suggesting a bad movie? And is it possible to infer structure between movies at the same time? In this paper we describe a unified Bayesian approach to collaborative filtering that accomplishes all of these goals. It models the discrete structure of ratings and is flexible enough to fit the often non-Gaussian shape of the ratings distribution. Additionally, our method finds a co-clustering of the users and items, which improves the model's accuracy and makes the model robust to fraud. We offer three main contributions: (1) We provide a novel model and Gibbs sampling algorithm that accurately models the quirks of real-world ratings, such as convex ratings distributions. (2) We provide proof of our model's robustness to spam and anomalous behavior. (3) We use several real-world datasets to demonstrate the model's effectiveness in accurately predicting users' ratings, avoiding prediction skew in the face of injected spam, and finding interesting patterns in real-world ratings data.