Nonparametric localized feature selection via a Dirichlet process mixture of generalized Dirichlet distributions

  • Authors:
  • Wentao Fan; Nizar Bouguila

  • Affiliations:
  • Concordia Institute for Information Systems Engineering, Concordia University, QC, Canada (both authors)

  • Venue:
  • ICONIP'12: Proceedings of the 19th International Conference on Neural Information Processing, Part III
  • Year:
  • 2012


Abstract

In this paper, we propose a novel Bayesian nonparametric statistical approach for simultaneous clustering and localized feature selection in unsupervised learning. The proposed model is based on a Dirichlet process mixture with generalized Dirichlet (GD) distributions, which can also be seen as an infinite GD mixture model. Due to the nature of the Bayesian nonparametric approach, the problems of overfitting and underfitting are prevented. Moreover, the determination of the number of clusters is sidestepped by assuming an infinite number of clusters. In our approach, the model parameters and the local feature saliencies are estimated simultaneously via variational inference. We report experimental results from applying our model to two challenging clustering problems involving web pages and tissue samples containing gene expression data.
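The abstract's two building blocks, the Dirichlet process prior and the generalized Dirichlet distribution, both admit simple constructive sampling schemes. The sketch below (an illustration of these standard constructions, not the authors' implementation) draws GD samples via independent Beta variates and builds truncated stick-breaking mixing weights for the DP prior; all parameter values are hypothetical.

```python
import numpy as np

def sample_generalized_dirichlet(alpha, beta, rng):
    """Draw one GD sample via its Beta construction:
    z_j ~ Beta(alpha_j, beta_j), x_j = z_j * prod_{i<j} (1 - z_i),
    so the components are nonnegative and sum to at most 1."""
    z = rng.beta(alpha, beta)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - z)[:-1]))
    return z * remaining

def stick_breaking_weights(concentration, truncation, rng):
    """Truncated stick-breaking construction of DP mixing weights:
    v_k ~ Beta(1, concentration), pi_k = v_k * prod_{j<k} (1 - v_j).
    Setting the last v to 1 absorbs the leftover stick mass, as is
    standard in truncated variational treatments of the DP."""
    v = rng.beta(1.0, concentration, size=truncation)
    v[-1] = 1.0
    return v * np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))

rng = np.random.default_rng(0)
# Hypothetical 3-dimensional GD shape parameters.
x = sample_generalized_dirichlet(np.array([2.0, 3.0, 4.0]),
                                 np.array([5.0, 4.0, 3.0]), rng)
# Hypothetical concentration and truncation level.
pi = stick_breaking_weights(concentration=1.0, truncation=20, rng=rng)
```

The "infinite" number of clusters in the abstract corresponds to letting the truncation level grow; in practice, variational schemes fix a finite truncation and let the posterior concentrate mass on the components it needs.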