Global analytic solution of fully-observed variational Bayesian matrix factorization
The Journal of Machine Learning Research
Recently, variational Bayesian (VB) techniques have been applied to probabilistic matrix factorization and shown to perform very well in experiments. In this paper, we theoretically elucidate properties of the VB matrix factorization (VBMF) method. Through a finite-sample analysis of the VBMF estimator, we show that two types of shrinkage factors exist in the VBMF estimator: positive-part James-Stein (PJS) shrinkage and trace-norm shrinkage, both acting on each singular component separately to produce low-rank solutions. The trace-norm shrinkage is simply induced by non-flat prior information, as in the maximum a posteriori (MAP) approach; thus, no trace-norm shrinkage remains when the priors are non-informative. In contrast, we show the counter-intuitive fact that the PJS shrinkage factor remains activated even with flat priors. This is induced by the non-identifiability of the matrix factorization model, that is, by the fact that the mapping between the target matrix and the factorized matrices is not one-to-one; we call this model-induced regularization. We further extend our analysis to empirical Bayes scenarios where the hyperparameters are also learned based on the VB free energy. Throughout the paper, we assume that the observed matrix has no missing entries, so collaborative filtering is outside our scope.
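To make the two shrinkage mechanisms concrete, the following sketch applies a simplified PJS-style factor and a soft-threshold (trace-norm-style) factor to each singular component of an observed matrix. The coefficients `c_pjs` and `lam` are illustrative placeholders, not the exact closed-form VBMF coefficients derived in the paper; the sketch only demonstrates how per-component shrinkage nulls small singular values and yields a low-rank estimate.

```python
import numpy as np

def shrink_singular_components(V, sigma2, c_pjs=1.0, lam=0.0):
    """Illustrative per-component shrinkage of the singular values of V.

    PJS-style factor: max(0, 1 - c_pjs * sigma2 * max(V.shape) / gamma**2),
    a positive-part James-Stein shrinkage acting on each component.
    Trace-norm-style factor: soft-thresholding by lam; lam = 0 mimics the
    flat-prior case, where only the PJS (model-induced) shrinkage remains.
    Both coefficients are simplified stand-ins, not the paper's exact ones.
    """
    U, gammas, Wt = np.linalg.svd(V, full_matrices=False)
    shrunk = []
    for g in gammas:
        pjs = max(0.0, 1.0 - c_pjs * sigma2 * max(V.shape) / g**2)
        g_hat = max(0.0, g * pjs - lam)  # soft threshold on top of PJS
        shrunk.append(g_hat)
    return U @ np.diag(shrunk) @ Wt

# Example: a noisy rank-1 matrix. Small (noise) singular components are
# nulled by the positive-part factor, giving a low-rank estimate.
rng = np.random.default_rng(0)
V = np.ones((8, 6)) + 0.1 * rng.normal(size=(8, 6))
U_hat = shrink_singular_components(V, sigma2=0.1)
```

Note that with `lam = 0` (flat priors) the estimate is still shrunk: the PJS factor alone truncates weak components, which is the model-induced regularization discussed above.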