Infinite Liouville mixture models with application to text and texture categorization

  • Authors: Nizar Bouguila
  • Affiliations: Concordia Institute for Information Systems Engineering, Faculty of Engineering and Computer Science, Concordia University, Montreal, Qc, Canada H3G 2W1
  • Venue: Pattern Recognition Letters
  • Year: 2012

Abstract

This paper addresses the modeling and clustering of proportional data using mixture models, a problem of importance for many practical applications in pattern recognition, image processing, data mining, and computer vision. Finite mixture models are broadly applicable to clustering problems, but they require selecting the number of clusters, which involves a trade-off. The number of clusters must be large enough to provide the discriminating capability required for a given application; however, if too many clusters are employed, overfitting may occur, and if too few are used, the model underfits. Here we approach the problem using infinite mixtures, which have been shown to be an efficient alternative to finite mixtures because they sidestep the selection of the optimal number of mixture components. In particular, we propose and discuss an infinite Liouville mixture model whose parameters are fitted to the data through a principled Bayesian algorithm that we have developed and that allows for uncertainty in the number of mixture components. Our experimental evaluation involves two challenging applications, namely text classification and texture discrimination, and suggests that the proposed approach can be an excellent choice for proportional data modeling.
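To make concrete the idea that an infinite mixture leaves the number of components open, the following minimal sketch draws cluster assignments from a Chinese restaurant process, a common prior construction for infinite mixtures. The abstract does not say which construction the paper uses, so this prior is an assumption. The function name `sample_crp_assignments` and the concentration parameter `alpha` are hypothetical, and the sketch covers neither the Liouville likelihood nor the paper's Bayesian fitting algorithm.

```python
import random


def sample_crp_assignments(n_points, alpha, seed=0):
    """Draw cluster labels for n_points from a Chinese restaurant process.

    Assumption: a CRP prior with concentration alpha > 0. This is an
    illustrative stand-in, not the algorithm from the paper.
    """
    rng = random.Random(seed)
    counts = []  # counts[k] = number of points currently assigned to cluster k
    labels = []
    for _ in range(n_points):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is opened with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1
        labels.append(k)
    return labels, counts


if __name__ == "__main__":
    labels, counts = sample_crp_assignments(n_points=200, alpha=1.0, seed=1)
    print("number of clusters used:", len(counts))
    print("cluster sizes:", counts)
```

The number of occupied clusters is not fixed beforehand: it emerges from the draws and tends to grow slowly with the number of points, with larger `alpha` favoring more clusters.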