A Bayesian framework for learning shared and individual subspaces from multiple data sources

  • Authors:
  • Sunil Kumar Gupta;Dinh Phung;Brett Adams;Svetha Venkatesh

  • Affiliations:
  • Department of Computing, Curtin University, Perth, Australia;Department of Computing, Curtin University, Perth, Australia;Department of Computing, Curtin University, Perth, Australia;Department of Computing, Curtin University, Perth, Australia

  • Venue:
  • PAKDD'11 Proceedings of the 15th Pacific-Asia conference on Advances in knowledge discovery and data mining - Volume Part I
  • Year:
  • 2011

Abstract

This paper presents a novel Bayesian formulation to exploit shared structures across multiple data sources, laying a foundation for effective mining and retrieval across disparate domains. We jointly analyze diverse data sources using a unifying piece of metadata (textual tags). We propose a method based on Bayesian Probabilistic Matrix Factorization (BPMF) that explicitly models the partial knowledge common to the datasets using shared subspaces and the knowledge specific to each dataset using individual subspaces. For the proposed model, we derive an efficient Gibbs sampling algorithm for learning the joint factorization. The effectiveness of the model is demonstrated on social media retrieval tasks across single and multiple media. The proposed solution is applicable in a wider context, providing a formal framework suitable for exploiting individual as well as mutual knowledge present across heterogeneous data sources of many kinds.
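
To make the shared/individual decomposition described in the abstract concrete, the following is a minimal sketch, not the paper's algorithm. It assumes each source i is a tag-by-item matrix X_i approximated as [W0, W_i] H_i, where W0 spans a subspace shared by all sources and W_i spans a subspace specific to source i. In place of the Gibbs sampler the abstract mentions, it uses plain alternating ridge-regularized least squares, which yields a point estimate only. All names (`fit_shared_individual`, `make_synthetic`, `unexplained_fraction`) and all dimensions are hypothetical choices made for illustration.

```python
import numpy as np


def make_synthetic(m=30, n_items=(40, 50), k_shared=2, k_indiv=2, seed=1):
    """Build toy tag-by-item matrices that share a k_shared-dim subspace (assumed setup)."""
    rng = np.random.default_rng(seed)
    W0 = rng.standard_normal((m, k_shared))           # shared basis over tags
    X_list = []
    for n in n_items:
        Wi = rng.standard_normal((m, k_indiv))        # source-specific basis
        H = rng.standard_normal((k_shared + k_indiv, n))
        X_list.append(np.hstack([W0, Wi]) @ H)
    return X_list


def fit_shared_individual(X_list, k_shared, k_indiv, lam=0.1, n_iter=100, seed=0):
    """Alternating ridge least squares for X_i ~ [W0, W_i] @ H_i (not Gibbs sampling)."""
    rng = np.random.default_rng(seed)
    m = X_list[0].shape[0]
    W0 = rng.standard_normal((m, k_shared))
    Ws = [rng.standard_normal((m, k_indiv)) for _ in X_list]
    Hs = [None] * len(X_list)
    reg_full = lam * np.eye(k_shared + k_indiv)
    reg_sh = lam * np.eye(k_shared)
    reg_in = lam * np.eye(k_indiv)

    for _ in range(n_iter):
        # 1) Per-source coefficients H_i, holding all bases fixed.
        for i, X in enumerate(X_list):
            W = np.hstack([W0, Ws[i]])
            Hs[i] = np.linalg.solve(W.T @ W + reg_full, W.T @ X)
        # 2) Individual bases W_i, fitted to what the shared part leaves unexplained.
        for i, X in enumerate(X_list):
            H_sh, H_in = Hs[i][:k_shared], Hs[i][k_shared:]
            resid = X - W0 @ H_sh
            Ws[i] = np.linalg.solve(H_in @ H_in.T + reg_in, H_in @ resid.T).T
        # 3) Shared basis W0, pooling residuals from every source.
        A = reg_sh.copy()
        B = np.zeros((k_shared, m))
        for i, X in enumerate(X_list):
            H_sh, H_in = Hs[i][:k_shared], Hs[i][k_shared:]
            resid = X - Ws[i] @ H_in
            A += H_sh @ H_sh.T
            B += H_sh @ resid.T
        W0 = np.linalg.solve(A, B).T
    return W0, Ws, Hs


def unexplained_fraction(X_list, W0, Ws, Hs):
    """Squared reconstruction error divided by total squared norm of the data."""
    num = sum(np.linalg.norm(X - np.hstack([W0, W]) @ H) ** 2
              for X, W, H in zip(X_list, Ws, Hs))
    den = sum(np.linalg.norm(X) ** 2 for X in X_list)
    return float(num / den)


X_list = make_synthetic()
W0, Ws, Hs = fit_shared_individual(X_list, k_shared=2, k_indiv=2)
print("unexplained fraction:", unexplained_fraction(X_list, W0, Ws, Hs))
```

Only step 3 couples the sources: the shared basis is updated from the pooled residuals, while steps 1 and 2 are solved independently per source. A fully Bayesian treatment such as the one the abstract describes would instead place priors on the factors and sample them, which this sketch does not attempt.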