In this paper, we examine the problem of clustering count data using finite mixtures of distributions. The multinomial and the multinomial Dirichlet distributions are widely accepted models for count data. We show that these two distributions are not the best choice in every application, and we propose another model, the multinomial generalized Dirichlet distribution (MGDD), which is the composition of the generalized Dirichlet distribution and the multinomial, in the same way that the multinomial Dirichlet distribution (MDD) is the composition of the Dirichlet and the multinomial. The parameters of our model are estimated with the deterministic annealing expectation-maximization (DAEM) approach, and the number of mixture components is determined with the minimum description length (MDL) criterion. We compare our method with standard approaches such as multinomial and multinomial Dirichlet mixtures to show its merits. The comparison involves applications such as spatial color image database indexing, handwritten digit recognition, and text document clustering.
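The abstract does not give the probability mass functions, so the following is only a minimal sketch of what the two compositions look like in log form. It assumes the usual parametrization: an MDD over K categories has a Dirichlet parameter vector of length K, while an MGDD has K - 1 pairs (alpha_d, beta_d) from the generalized Dirichlet. Integrating the multinomial against a generalized Dirichlet prior yields a product of beta-binomial terms, which is what `log_mgdd` computes. The function names are hypothetical and are not taken from the paper.

```python
from itertools import product
from math import exp, lgamma


def log_multinomial_coef(counts):
    """Log of the multinomial coefficient n! / (x_1! ... x_K!)."""
    n = sum(counts)
    return lgamma(n + 1) - sum(lgamma(x + 1) for x in counts)


def log_mdd(counts, alpha):
    """Log-pmf of the multinomial Dirichlet distribution (MDD).

    counts and alpha both have length K.
    """
    n, a_sum = sum(counts), sum(alpha)
    out = log_multinomial_coef(counts) + lgamma(a_sum) - lgamma(n + a_sum)
    for x, a in zip(counts, alpha):
        out += lgamma(x + a) - lgamma(a)
    return out


def log_mgdd(counts, alpha, beta):
    """Log-pmf of the multinomial generalized Dirichlet distribution (MGDD).

    counts has length K; alpha and beta have length K - 1 (assumed
    parametrization of the generalized Dirichlet).
    """
    assert len(alpha) == len(beta) == len(counts) - 1
    out = log_multinomial_coef(counts)
    tail = sum(counts)
    for x, a, b in zip(counts, alpha, beta):
        tail -= x  # total count in the categories after the current one
        out += lgamma(a + b) - lgamma(a) - lgamma(b)
        out += lgamma(a + x) + lgamma(b + tail) - lgamma(a + b + x + tail)
    return out
```

A quick sanity check on this sketch: summing `exp(log_mgdd(...))` over every count vector with a fixed total should give 1, and choosing `beta[d]` equal to the sum of the Dirichlet parameters that follow position `d` should reproduce `log_mdd`, since the generalized Dirichlet reduces to the Dirichlet in that case. The extra K - 2 free parameters are what allow the MGDD a more flexible covariance structure than the MDD.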