Adaptive evidence accumulation clustering using the confidence of the objects' assignments

Authors:
João M. M. Duarte;Ana L. N. Fred;F. Jorge F. Duarte
Affiliations:
GECAD - Knowledge Engineering and Decision Support Group, Institute of Engineering, Polytechnic of Porto (ISEP/IPP), Porto, Portugal,Instituto de Telecomunicações, Instituto Superior T&# ...;Instituto de Telecomunicações, Instituto Superior Técnico, Lisboa, Portugal;GECAD - Knowledge Engineering and Decision Support Group, Institute of Engineering, Polytechnic of Porto (ISEP/IPP), Porto, Portugal
Venue:
PAKDD'12 Proceedings of the 2012 Pacific-Asia conference on Emerging Trends in Knowledge Discovery and Data Mining
Year:
2012

Citing 14
Cited 0

Evidence Accumulation Clustering Based on the K-Means Algorithm

Proceedings of the Joint IAPR International Workshop on Structural, Syntactic, and Statistical Pattern Recognition
Voting-Merging: An Ensemble Method for Clustering

ICANN '01 Proceedings of the International Conference on Artificial Neural Networks
A decision-theoretic generalization of on-line learning and an application to boosting

EuroCOLT '95 Proceedings of the Second European Conference on Computational Learning Theory
Finding Consistent Clusters in Data Partitions

MCS '01 Proceedings of the Second International Workshop on Multiple Classifier Systems
Cluster ensembles --- a knowledge reuse framework for combining multiple partitions

The Journal of Machine Learning Research
Combining Multiple Weak Clusterings

ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
Ensembles of Partitions via Data Resampling

ITCC '04 Proceedings of the International Conference on Information Technology: Coding and Computing (ITCC'04) Volume 2 - Volume 2
Solving cluster ensemble problems by bipartite graph partitioning

ICML '04 Proceedings of the twenty-first international conference on Machine learning
Adaptive Clustering Ensembles

ICPR '04 Proceedings of the Pattern Recognition, 17th International Conference on (ICPR'04) Volume 1 - Volume 01
Combining Multiple Clusterings Using Evidence Accumulation

IEEE Transactions on Pattern Analysis and Machine Intelligence
Moderate diversity for better cluster ensembles

Information Fusion
Fuzzy Clustering Ensemble Based on Dual Boosting

FSKD '07 Proceedings of the Fourth International Conference on Fuzzy Systems and Knowledge Discovery - Volume 02
Boosting for Model-Based Data Clustering

Proceedings of the 30th DAGM symposium on Pattern Recognition
Weighted cluster ensembles: Methods and analysis

ACM Transactions on Knowledge Discovery from Data (TKDD)

Quantified Score

Hi-index	0.00

Visualization

Abstract

Ensemble methods are known to increase the performance of learning algorithms, both on supervised and unsupervised learning. Boosting algorithms are quite successful in supervised ensemble methods. These algorithms build incrementally an ensemble of classifiers by focusing on objects previously misclassified while training the current classifier. In this paper we propose an extension to the Evidence Accumulation Clustering method inspired by the Boosting algorithms. While on supervised learning the identification of misclassified objects is a trivial task because the labels for each object are known, on unsupervised learning these are unknown, making it difficult to identify the objects on which the clustering algorithm should focus. The proposed approach uses the information contained in the co-association matrix to identify degrees of confidence of the assignments of each object to its cluster. The degree of confidence is then used to select which objects should be emphasized in the learning process of the clustering algorithm. New consensus partition validity measures, based on the notion of degree of confidence, are also proposed. In order to evaluate the performance of our approaches, experiments on several artificial and real data sets were performed and shown the adaptive clustering ensemble method and the consensus partition validity measure help to improve the quality of data clustering.