A fully Bayesian model based on reversible jump MCMC and finite Beta mixtures for clustering

  • Authors:
  • Nizar Bouguila;Tarek Elguebaly

  • Affiliations:
  • Concordia Institute for Information Systems Engineering, Faculty of Engineering and Computer Science, Concordia University, Montreal, Qc, Canada H3G 2W1;Concordia Institute for Information Systems Engineering, Faculty of Engineering and Computer Science, Concordia University, Montreal, Qc, Canada H3G 2W1

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2012

Quantified Score

Hi-index 12.05

Abstract

Mixture models have attracted considerable interest in image and signal processing, both for their theoretical development and for their usefulness in several applications. In the last few years, researchers have approached the problem of mixture estimation and selection, which arises when modeling complex datasets, with a variety of techniques. In theory, it is well known that fully Bayesian approaches to this problem are optimal. Bayesian learning allows prior knowledge to be incorporated in a formal, coherent way that avoids overfitting. In this paper, we propose a fully Bayesian approach for learning finite Beta mixtures using a reversible jump Markov chain Monte Carlo (RJMCMC) technique, which simultaneously allows cluster assignment, parameter estimation, and selection of the optimal number of clusters. The adverb "fully" is justified by the fact that all parameters of interest in our model, including the number of clusters and the missing values, are treated as random variables for which priors are specified and whose posteriors are approximated using RJMCMC. Our work is motivated by the fact that Beta mixtures are able to fit any unknown distributional shape and can therefore be considered a useful class of flexible models for problems and applications involving measurements and features that deviate markedly from the Gaussian shape. The usefulness of the proposed approach is confirmed using synthetic mixture data, real data, and an application to texture classification and retrieval.
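
To make the cluster-assignment part of the abstract concrete, the following is a minimal sketch of one ingredient that a sampler for a finite Beta mixture would typically need: computing the posterior membership probabilities of a univariate observation and drawing a component label from them. It is not the authors' implementation. The function names, the two-component weights and shape parameters, and the toy data are all assumptions chosen for illustration, and the reversible jump moves that change the number of clusters are deliberately left out.

```python
import math
import random


def beta_logpdf(x, a, b):
    """Log density of a Beta(a, b) distribution at x in (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x)


def responsibilities(x, weights, params):
    """Posterior probability that x was generated by each mixture component."""
    logs = [math.log(w) + beta_logpdf(x, a, b)
            for w, (a, b) in zip(weights, params)]
    top = max(logs)  # subtract the maximum for numerical stability
    unnorm = [math.exp(v - top) for v in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def sample_allocations(data, weights, params, rng):
    """Draw one component label per observation from its responsibilities."""
    labels = range(len(weights))
    return [rng.choices(labels, weights=responsibilities(x, weights, params))[0]
            for x in data]


# Hypothetical two-component mixture and toy data (illustrative values only).
weights = [0.5, 0.5]
params = [(2.0, 8.0), (8.0, 2.0)]
data = [0.10, 0.15, 0.85, 0.90]

rng = random.Random(0)
z = sample_allocations(data, weights, params, rng)
```

In a full sampler, a step like this would alternate with updates of the mixing weights and shape parameters, and with the trans-dimensional moves that propose adding or removing components.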