Learning Pairwise Similarity for Data Clustering

Authors:
Ana L. N. Fred; Anil K. Jain
Affiliations:
Instituto Superior Técnico, Lisbon, Portugal; Michigan State University, East Lansing, USA
Venue:
ICPR '06 Proceedings of the 18th International Conference on Pattern Recognition - Volume 01
Year:
2006

Cited by (7):

Learning multiple nonredundant clusterings
ACM Transactions on Knowledge Discovery from Data (TKDD)

Pairwise probabilistic clustering using evidence accumulation
SSPR&SPR'10 Proceedings of the 2010 joint IAPR international conference on Structural, syntactic, and statistical pattern recognition

A metric to evaluate a cluster by eliminating effect of complement cluster
KI'11 Proceedings of the 34th Annual German conference on Advances in artificial intelligence

A new asymmetric criterion for cluster validation
CIARP'11 Proceedings of the 16th Iberoamerican Congress conference on Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications

A max metric to evaluate a cluster
HAIS'12 Proceedings of the 7th international conference on Hybrid Artificial Intelligent Systems - Volume Part I

A clustering ensemble based on a modified normalized mutual information metric
AMT'12 Proceedings of the 8th international conference on Active Media Technology

Pairwise similarity for cluster ensemble problem: link-based and approximate approaches
Transactions on Large-Scale Data- and Knowledge-centered systems IX

Abstract

Each clustering algorithm induces a similarity between the given data points, according to its underlying clustering criterion. Given the large number of available clustering techniques, one faces the following questions: (a) Which measure of similarity should be used in a given clustering problem? (b) Should the same similarity measure be used throughout the d-dimensional feature space? In other words, are the underlying clusters in the given data of similar shape? Our goal is to learn the pairwise similarity between points in order to facilitate a proper partitioning of the data without a priori knowledge of k, the number of clusters, or of the shapes of these clusters. We explore a clustering ensemble approach combined with cluster stability criteria to selectively learn the similarity from a collection of different clustering algorithms with various parameter configurations.
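
To make the idea of learning a pairwise similarity from a clustering ensemble concrete, here is a minimal sketch. The abstract does not say how the individual partitions are combined, so the sketch assumes a simple co-occurrence vote: the similarity of two points is the fraction of ensemble partitions that place them in the same cluster. It uses only plain k-means with varying k as the ensemble, and it omits the cluster stability criteria mentioned in the abstract. The names `kmeans_labels` and `co_association` are hypothetical and chosen for illustration only.

```python
import random


def kmeans_labels(points, k, rng, iters=20):
    """Plain Lloyd's k-means on tuples; returns one cluster label per point."""
    centers = rng.sample(points, k)  # assumption: random data points as initial centers
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def co_association(points, ks, runs_per_k, seed=0):
    """Fraction of ensemble partitions in which each pair of points shares a cluster."""
    rng = random.Random(seed)
    n = len(points)
    counts = [[0] * n for _ in range(n)]
    total = 0
    for k in ks:  # vary the parameter configuration (here only k)
        for _ in range(runs_per_k):
            labels = kmeans_labels(points, k, rng)
            total += 1
            for i in range(n):
                for j in range(n):
                    if labels[i] == labels[j]:
                        counts[i][j] += 1
    return [[c / total for c in row] for row in counts]


if __name__ == "__main__":
    # Two well-separated groups of three points each (made-up toy data).
    pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
           (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    sim = co_association(pts, ks=[2, 3], runs_per_k=5, seed=0)
    for row in sim:
        print(" ".join(f"{v:.2f}" for v in row))
```

The resulting matrix is symmetric with ones on the diagonal and can be handed to any clustering method that accepts pairwise similarities. Selecting which partitions (or which clusters within them) contribute to the vote, as the stability criteria in the abstract suggest, would replace the uniform count used here.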