Visualization and clustering of multivariate data are usually based on mutual distances between samples, measured heuristically, for example as the Euclidean distance between vectors of extracted features. Our recently developed methods remove this arbitrariness by learning to measure the important differences; the effect is equivalent to changing the metric of the data space. Variation in the data is assumed to be important only to the extent that it causes variation in auxiliary data available paired with the primary data. Learning the metric is supervised by the auxiliary data, whereas the subsequent data analysis in the new metric is unsupervised. We review two approaches: a clustering algorithm and a method based on an explicitly generated metric. Applications have so far been in the exploratory analysis of texts, gene function, and bankruptcy. Relationships between the two approaches are derived, leading to promising new approaches to the clustering problem.
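As a concrete illustration of the explicit-metric approach, the learning-metrics idea can be read as measuring local distances with the Fisher information matrix of the conditional distribution p(c|x) of the auxiliary data c, so that d²(x, x+dx) = dxᵀ J(x) dx and only variation of x that changes p(c|x) contributes. Below is a minimal, hedged sketch under that reading; the multinomial logistic model is a stand-in estimator of p(c|x) (the original work uses other conditional-density estimates), and all function names here are illustrative rather than from any published code.

```python
# Sketch of the learning-metrics principle: local distances in the primary
# data space are measured by the Fisher information matrix J(x) of the
# conditional auxiliary-data distribution p(c|x), so only directions that
# change p(c|x) count. The logistic model below is an assumed stand-in.
import numpy as np
from sklearn.linear_model import LogisticRegression

def fisher_matrix(model, x):
    """J(x) = sum_c p(c|x) * grad log p(c|x) grad log p(c|x)^T
    for a softmax model p(c|x) estimated by logistic regression."""
    x = np.atleast_2d(x)
    p = model.predict_proba(x)[0]          # p(c|x), shape (C,)
    W = model.coef_
    if W.shape[0] == 1:                    # sklearn stores the binary case as one row
        W = np.vstack([np.zeros_like(W[0]), W[0]])
    # For softmax: grad_x log p(c|x) = W_c - sum_k p(k|x) W_k
    G = W - p @ W                          # rows are the per-class gradients
    return (G * p[:, None]).T @ G          # d x d Fisher information matrix

def local_distance(model, x, dx):
    """Squared local learning-metric distance d^2(x, x+dx) = dx^T J(x) dx."""
    return float(dx @ fisher_matrix(model, x) @ dx)

# Toy usage: the auxiliary label depends on dimension 0 only, so movement
# along dimension 1 should be nearly free in the learned metric.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
c = (X[:, 0] > 0).astype(int)
model = LogisticRegression().fit(X, c)
x0 = np.array([0.1, 0.0])
print(local_distance(model, x0, np.array([0.5, 0.0])))  # large: changes p(c|x)
print(local_distance(model, x0, np.array([0.0, 0.5])))  # near zero: irrelevant direction
```

The design choice to make the metric local (a Riemannian metric rather than a single global transform) matches the review's framing: the supervision shapes how distances are measured everywhere in the space, while the clustering or visualization performed in that metric remains unsupervised.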