Mixture models for learning low-dimensional roles in high-dimensional data

  • Authors:
  • Manas Somaiya;Christopher Jermaine;Sanjay Ranka

  • Affiliations:
  • University of Florida, Gainesville, FL, USA;Rice University, Houston, TX, USA;University of Florida, Gainesville, FL, USA

  • Venue:
  • Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year:
  • 2010

Abstract

Archived data often describe entities that participate in multiple roles. Each of these roles may influence various aspects of the data. For example, a register transaction collected at a retail store may have been initiated by a person who is a woman, a mother, an avid reader, and an action movie fan. Each of these roles can influence various aspects of the customer's purchase: the fact that the customer is a mother may greatly influence the purchase of a toddler-sized pair of pants, but have no influence on the purchase of an action-adventure novel. The fact that the customer is an action movie fan and an avid reader may influence the purchase of the novel, but will have no effect on the purchase of a shirt. In this paper, we present a generic Bayesian framework for capturing exactly this situation. In our framework, it is assumed that multiple roles exist, and each data point corresponds to an entity (such as a retail customer, an email, or a news article) that selects various roles, which then compete to influence the various attributes associated with the data point. We develop robust MCMC algorithms for learning the models under the framework.
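
The generative idea in the abstract (an entity selects roles, and the selected roles compete to influence each attribute) can be illustrated with a toy sampler. This is only a sketch under assumptions that the abstract does not state: attributes are binary, each role is selected independently, an always-active background role exists, and competition is resolved by picking one winning role per attribute with probability proportional to a weight. The names sample_data_point, role_probs, weights, and params are hypothetical and are not taken from the paper.

```python
import random


def sample_data_point(role_probs, weights, params, rng):
    """Draw one binary data point from a toy role-competition model.

    role_probs[r] : probability that the entity selects role r
    weights[r][a] : non-negative strength of role r's claim on attribute a
    params[r][a]  : Bernoulli parameter role r uses for attribute a

    Assumption: role 0 is a background role that is always active and has
    positive weight on every attribute, so each attribute has a winner.
    """
    n_roles = len(role_probs)
    n_attrs = len(params[0])

    # Step 1: the entity selects its roles (role 0 is always on).
    active = [0] + [r for r in range(1, n_roles) if rng.random() < role_probs[r]]

    winners, values = [], []
    for a in range(n_attrs):
        # Step 2: active roles compete for attribute a in proportion to weight.
        w = [weights[r][a] for r in active]
        winner = rng.choices(active, weights=w, k=1)[0]
        # Step 3: the winning role alone generates the attribute value.
        values.append(1 if rng.random() < params[winner][a] else 0)
        winners.append(winner)
    return active, winners, values


# Hypothetical setup: attributes are (toddler pants, action novel, shirt).
rng = random.Random(0)
role_probs = [1.0, 0.5, 0.5]      # background, "mother", "action movie fan"
weights = [[1.0, 1.0, 1.0],       # background has a weak claim on everything
           [9.0, 0.0, 0.0],       # "mother" only competes for toddler pants
           [0.0, 9.0, 0.0]]       # "action movie fan" only competes for the novel
params = [[0.05, 0.05, 0.05],
          [0.90, 0.00, 0.00],
          [0.00, 0.90, 0.00]]

active, winners, values = sample_data_point(role_probs, weights, params, rng)
```

In this setup no specialised role has any weight on the shirt, so the background role always generates it, mirroring the abstract's point that a role can strongly influence some attributes and leave others untouched. Learning such a model from data, which the paper does with MCMC, is not shown here.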