Active learning for online bayesian matrix factorization

  • Authors:
  • Jorge Silva;Lawrence Carin

  • Affiliations:
  • Duke University, Durham, NC, USA;Duke University, Durham, NC, USA

  • Venue:
  • Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

The problem of large-scale online matrix completion is addressed via a Bayesian approach. The proposed method learns a factor analysis (FA) model for large matrices, based on a small number of observed matrix elements, and leverages the statistical model to actively select which new matrix entries/observations would be most informative if they could be acquired, to improve the model; the model inference and active learning are performed in an online setting. In the context of online learning, a greedy, fast and provably near-optimal algorithm is employed to sequentially maximize the mutual information between past and future observations, taking advantage of submodularity properties. Additionally, a simpler procedure, which directly uses the posterior parameters learned by the Bayesian approach, is shown to achieve slightly lower estimation quality, with far less computational effort. Inference is performed using a computationally efficient online variational Bayes (VB) procedure. Competitive results are obtained in a very large collaborative filtering problem, namely the Yahoo! Music ratings dataset.