How to quantitatively compare data dissimilarities for unsupervised machine learning?

  • Authors:
  • Bassam Mokbel, Sebastian Gross, Markus Lux, Niels Pinkwart, Barbara Hammer

  • Affiliations:
  • CITEC Centre of Excellence, Bielefeld University, Germany (B. Mokbel, M. Lux, B. Hammer); Computer Science Institute, Clausthal University of Technology, Germany (S. Gross, N. Pinkwart)

  • Venue:
  • ANNPR'12 Proceedings of the 5th INNS IAPR TC 3 GIRPR conference on Artificial Neural Networks in Pattern Recognition
  • Year:
  • 2012

Abstract

For complex data sets, the pairwise similarity or dissimilarity of data often serves as the interface between the application scenario and the machine learning tool. Hence, the final result of training is strongly influenced by the choice of the dissimilarity measure. While dissimilarity measures for supervised settings can ultimately be compared via the classification error, the situation is less clear in unsupervised domains, where a clear objective is lacking. This raises the question of how to compare dissimilarity measures and their influence on the final result in such cases. In this contribution, we propose to use a recent quantitative measure, introduced in the context of unsupervised dimensionality reduction, to assess whether, and on which scale, dissimilarities coincide for an unsupervised learning task. Essentially, the measure evaluates to what extent neighborhood relations are preserved when they are evaluated on the basis of rankings, thereby making the measure robust against scaling of the data. Beyond a global comparison, local versions allow one to highlight regions of the data where two dissimilarity measures induce the same results.