Optimal clustering in the context of overlapping cluster analysis

Authors:
Wim De Mulder
Affiliations:
Systems Research Group, University of Ghent, Ghent 9052, Belgium
Venue:
Information Sciences: an International Journal
Year:
2013

Abstract

In this paper we give a general definition of the concept of 'optimal clustering' that is applicable to overlapping clusterings. Overlapping clusterings are a generalization of hard clusterings, and their structure is formally developed in this paper. It is generally assumed that the domain of clustering is too heuristic to admit a general, i.e. axiomatic, definition of an optimal clustering. We show, however, that such a definition can be given within the domain of overlapping clusterings, using the new concept of dual clustering developed in this paper. A second concept underlying our definition of optimal clustering is the average clustering, which also plays an important role in the domain of cluster ensembles. Using these general concepts, we then show that, under some conditions, the final hard clustering extracted by majority vote from a given set of clusterings is guaranteed to be optimal over all hard clusterings. Unlike traditional research on validating clusterings, we do not add a new cluster validation measure to the many existing ones; rather, we develop a general framework for cluster validation measures, at least within the domain of overlapping clusterings. This framework makes it possible to derive some general theorems about clustering.
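
The abstract does not give the paper's formal definitions, so the following is only an illustrative sketch of how an average clustering and a majority-vote hard clustering are commonly computed in the cluster-ensemble setting. It assumes that every input is a hard clustering over the same objects and that cluster labels have already been aligned across the inputs. The function names `average_clustering` and `majority_vote` and the toy data are hypothetical, not taken from the paper, and the tie-breaking rule is an arbitrary choice.

```python
from collections import Counter


def average_clustering(clusterings, n_clusters):
    """For each object, the fraction of input clusterings that put it in
    each cluster (assumes labels are aligned across clusterings)."""
    n_objects = len(clusterings[0])
    avg = [[0.0] * n_clusters for _ in range(n_objects)]
    weight = 1.0 / len(clusterings)
    for labels in clusterings:
        for obj, cluster in enumerate(labels):
            avg[obj][cluster] += weight
    return avg


def majority_vote(clusterings):
    """Assign each object to the label it receives most often.
    Ties are broken by the smallest label (an arbitrary assumption)."""
    n_objects = len(clusterings[0])
    result = []
    for obj in range(n_objects):
        counts = Counter(labels[obj] for labels in clusterings)
        best = max(counts.values())
        result.append(min(c for c, k in counts.items() if k == best))
    return result


# Three hypothetical hard clusterings of five objects into clusters 0..2.
clusterings = [
    [0, 0, 1, 1, 2],
    [0, 0, 1, 2, 2],
    [0, 1, 1, 1, 2],
]

avg = average_clustering(clusterings, n_clusters=3)
consensus = majority_vote(clusterings)
print(consensus)  # [0, 0, 1, 1, 2]
```

In this sketch each row of `avg` sums to 1, so the average can be read as a soft membership vector per object, and `majority_vote` picks the largest entry of each row. Whether this matches the paper's notion of optimality depends on conditions that the abstract does not state.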