The most common approach in predictive modeling is to describe cases with feature vectors, collected into a design matrix. Many machine learning methods, such as linear regression and support vector machines, rely on this representation. However, when the underlying data has strong relational patterns, especially relations of high cardinality, the design matrix can grow very large, making learning and prediction slow or even infeasible. This work addresses the issue by exploiting repeating patterns in the design matrix that stem from the underlying relational structure of the data. It is shown how coordinate descent learning and Bayesian Markov chain Monte Carlo inference can be scaled for linear regression and factorization machine models. Empirically, on two large and highly competitive datasets (Netflix Prize, KDD Cup 2012), it is shown that (1) standard learning algorithms based on the materialized design matrix cannot scale to relational predictor variables, (2) the proposed algorithms do scale, and (3) the predictive quality of the proposed generic feature-based approach matches that of the best specialized models tailored to the respective tasks.
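To make the core idea concrete, here is a minimal toy sketch (an illustrative assumption, not the paper's actual algorithm): when many rows of the design matrix are identical repeats induced by a relation, sufficient statistics such as X^T X and X^T y for linear regression can be accumulated from the distinct row patterns weighted by their repetition counts, so the cost depends on the number of distinct patterns rather than the number of cases. The variable names (`distinct`, `counts`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three distinct row patterns, repeated 4, 2, and 3 times respectively.
distinct = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
counts = np.array([4, 2, 3])
idx = np.repeat(np.arange(3), counts)   # expand to the full (materialized) matrix
X = distinct[idx]                        # 9 x 2 design matrix with repeated rows
y = X @ np.array([2.0, -1.0]) + 0.01 * rng.standard_normal(len(X))

# Naive normal equations over the fully materialized design matrix.
w_full = np.linalg.solve(X.T @ X, X.T @ y)

# Same sufficient statistics from distinct patterns only:
# X^T X = sum_g count_g * x_g x_g^T, and
# X^T y = sum_g x_g * (sum of targets within pattern group g).
XtX = (distinct * counts[:, None]).T @ distinct
Xty = distinct.T @ np.array([y[idx == g].sum() for g in range(3)])
w_compact = np.linalg.solve(XtX, Xty)

assert np.allclose(w_full, w_compact)
```

The two solutions agree exactly because rows within a group are identical, so their contributions to the normal equations factor into a per-pattern term times a count; the paper's contribution is extending this kind of repetition-aware computation to coordinate descent and MCMC inference for factorization machines.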