The PDG-Mixture Model for Clustering

Authors:
M. Julia Flores;José A. Gámez;Jens D. Nielsen
Affiliations:
Computing Systems Dept. & SIMD Lab in I3A, University of Castilla-La Mancha, Albacete, Spain;Computing Systems Dept. & SIMD Lab in I3A, University of Castilla-La Mancha, Albacete, Spain;Computing Systems Dept. & SIMD Lab in I3A, University of Castilla-La Mancha, Albacete, Spain
Venue:
DaWaK '09 Proceedings of the 11th International Conference on Data Warehousing and Knowledge Discovery
Year:
2009

Abstract

Within data mining, clustering can be considered the most important unsupervised learning problem; it deals with finding structure in a collection of unlabeled data. Generally, clustering refers to the process of organizing objects into groups whose members are similar. Among clustering approaches, methods based on probabilistic models have been extensively developed, such as Naïve Bayes (NB) with a latent class (cluster identifier) learned via an EM algorithm. Probabilistic Decision Graphs (PDGs) are a class of graphical models that can naturally encode some context-specific independencies that cannot always be efficiently captured by other commonly used models. In this paper we propose using a mixture of PDG models for cluster discovery, and we introduce an algorithm for automatic induction of both the mixture and its component models. The proposed approach was experimentally evaluated on both synthetic and real-world databases, and the presentation of the results includes a comparison with related techniques. The comparison demonstrates that the mixture of PDG models is competitive with respect to likelihood. Moreover, the mixture of PDG models tends to use fewer models (clusters) to represent domains where other models use a large number of clusters.
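To make the baseline mentioned in the abstract concrete, the following is a minimal sketch of EM for a Naïve Bayes model with a latent class variable. It is not the PDG-mixture induction algorithm proposed in the paper, whose details are not given here. The restriction to binary attributes, the Laplace smoothing in the M-step, and the names `em_latent_nb`, `priors` and `theta` are assumptions made for illustration only.

```python
import math
import random


def em_latent_nb(data, k, iters=50, seed=0):
    """EM for Naive Bayes with a hidden cluster variable (binary attributes).

    Illustrative assumptions: attributes are 0/1, and the M-step applies
    Laplace smoothing so that no parameter becomes exactly 0 or 1.
    Returns (priors, theta, history), where priors[c] = P(cluster c),
    theta[c][j] = P(x_j = 1 | cluster c), and history holds the data
    log-likelihood computed in each E-step.
    """
    rng = random.Random(seed)
    n, d = len(data), len(data[0])
    priors = [1.0 / k] * k
    # Random starting point so the clusters are not identical.
    theta = [[rng.uniform(0.25, 0.75) for _ in range(d)] for _ in range(k)]
    history = []
    for _ in range(iters):
        # E-step: posterior P(cluster | x) for every record.
        resp = []
        loglik = 0.0
        for x in data:
            joint = []
            for c in range(k):
                p = priors[c]
                for j in range(d):
                    p *= theta[c][j] if x[j] else 1.0 - theta[c][j]
                joint.append(p)
            total = sum(joint)
            loglik += math.log(total)
            resp.append([p / total for p in joint])
        history.append(loglik)
        # M-step: re-estimate parameters from the soft assignments.
        for c in range(k):
            weight = sum(r[c] for r in resp)
            priors[c] = (weight + 1.0) / (n + k)
            for j in range(d):
                on = sum(r[c] for r, x in zip(resp, data) if x[j])
                theta[c][j] = (on + 1.0) / (weight + 2.0)
    return priors, theta, history


if __name__ == "__main__":
    # Two clearly separated groups of records (made-up toy data).
    toy = [[1, 1, 1, 0, 0, 0]] * 20 + [[0, 0, 0, 1, 1, 1]] * 20
    priors, theta, history = em_latent_nb(toy, k=2)
    print("cluster priors:", priors)
    print("P(x_1 = 1 | cluster):", [row[0] for row in theta])
```

In this sketch each cluster is a single product distribution over the attributes. The approach described in the abstract instead uses a PDG as each mixture component, which is what allows context-specific independencies to be represented within a cluster.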