In this work we present an unsupervised algorithm for learning finite mixture models from multivariate positive data. Such data arise naturally in many applications, yet they have not been adequately addressed in the past. The mixture model is based on the inverted Dirichlet distribution, which is well suited to representing and modeling positive non-Gaussian data. The parameters of the inverted Dirichlet mixture are estimated by maximum likelihood (ML) using the Newton-Raphson method. We also develop an approach, based on the minimum message length (MML) criterion, for selecting the optimal number of clusters to represent the data with such a mixture. Experimental results are presented on artificial histograms and real data sets. The challenging problem of classifying software modules is also investigated within the proposed statistical framework.
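
As a rough illustration of the model named above, the following sketch evaluates the log-density of a single inverted Dirichlet component and of a finite mixture of such components. It assumes the standard parameterization of the inverted Dirichlet, with a positive vector x of dimension D and D+1 positive parameters; the abstract does not state the parameterization. The function names `inverted_dirichlet_logpdf` and `mixture_logpdf` are hypothetical, and the Newton-Raphson estimation and MML model selection steps are not reproduced here.

```python
import math


def inverted_dirichlet_logpdf(x, alpha):
    """Log-density of an inverted Dirichlet at a positive vector x.

    Assumed parameterization (not given in the abstract):
    x has D entries, alpha has D + 1 entries, all strictly positive.
    """
    if len(alpha) != len(x) + 1:
        raise ValueError("alpha must have exactly one more entry than x")
    if any(v <= 0 for v in x) or any(a <= 0 for a in alpha):
        raise ValueError("x and alpha must be strictly positive")
    total = sum(alpha)
    # Log normalizing constant: log Gamma(|alpha|) - sum_d log Gamma(alpha_d)
    log_norm = math.lgamma(total) - sum(math.lgamma(a) for a in alpha)
    # Kernel over the first D parameters: sum_d (alpha_d - 1) * log x_d
    log_kernel = sum((a - 1.0) * math.log(v) for a, v in zip(alpha, x))
    # Tail term: -|alpha| * log(1 + sum_d x_d)
    return log_norm + log_kernel - total * math.log1p(sum(x))


def mixture_logpdf(x, weights, alphas):
    """Log-density of a finite mixture of inverted Dirichlet components."""
    terms = [
        math.log(w) + inverted_dirichlet_logpdf(x, a)
        for w, a in zip(weights, alphas)
    ]
    # Log-sum-exp for numerical stability
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))
```

With D = 1 this density reduces to the beta prime distribution, which gives a simple sanity check on the formula.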