Clustering based on conditional distributions in an auxiliary space

  • Authors:
  • Janne Sinkkonen; Samuel Kaski

  • Affiliations:
  • Neural Networks Research Centre, Helsinki University of Technology, FIN-02015 HUT, Finland (both authors)

  • Venue:
  • Neural Computation
  • Year:
  • 2002

Abstract

We study the problem of learning groups or categories that are local in the continuous primary space but homogeneous with respect to the distributions of an associated auxiliary random variable over a discrete auxiliary space. Assuming that variation in the auxiliary space is meaningful, the categories will emphasize similarly meaningful aspects of the primary space. From a data set consisting of pairs of primary and auxiliary items, the categories are learned by minimizing a Kullback-Leibler divergence-based distortion between (implicitly estimated) distributions of the auxiliary data, conditioned on the primary data. Still, the categories are defined in terms of the primary space. An online algorithm resembling traditional Hebb-type competitive learning is introduced for learning the categories. Minimizing the distortion criterion turns out to be equivalent to maximizing the mutual information between the categories and the auxiliary data. In addition, connections to density estimation and to the distributional clustering paradigm are outlined. The method is demonstrated by clustering yeast gene expression data from DNA chips, with biological knowledge about the functional classes of the genes as the auxiliary data.
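To make the setup concrete, below is a minimal, hypothetical sketch of an online update of this flavor; it is not the authors' exact algorithm. Prototypes m_j live in the primary space, each cluster j carries a distribution psi_j over the discrete auxiliary classes, and every observed pair (x, c) triggers a stochastic gradient step on the soft-assigned log-likelihood sum_j y_j(x) log psi_j(c). Because the smooth memberships y_j(x) sum to one, maximizing this quantity equals minimizing the Kullback-Leibler distortion up to a term that does not depend on the parameters. The function name, the Gaussian membership width sigma, and the learning rates are illustrative assumptions.

```python
import numpy as np

def discriminative_clustering(X, C, n_clusters, n_classes, sigma=1.0,
                              lr_m=0.05, lr_psi=0.05, n_epochs=20, seed=0):
    """Toy online sketch (hypothetical, not the paper's exact update):
    prototypes m_j in the primary space, each with a distribution psi_j
    over the discrete auxiliary classes."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    m = X[rng.choice(n, n_clusters, replace=False)].copy()   # prototype locations
    psi = np.full((n_clusters, n_classes), 1.0 / n_classes)  # auxiliary distributions

    for _ in range(n_epochs):
        for i in rng.permutation(n):
            x, c = X[i], int(C[i])
            # Smooth Gaussian (softmax) cluster memberships y_j(x).
            d2 = np.sum((m - x) ** 2, axis=1)
            y = np.exp(-d2 / (2 * sigma ** 2))
            y /= y.sum()
            # Running-average estimate of each cluster's auxiliary class
            # distribution: pull psi_j toward the observed class, weighted
            # by the membership y_j(x). Row sums stay at 1 by construction.
            target = np.zeros(n_classes)
            target[c] = 1.0
            psi += lr_psi * y[:, None] * (target - psi)
            # Stochastic gradient ascent on sum_j y_j(x) log psi_j(c):
            # the softmax gradient gives weight_j = y_j (s_j - sum_l y_l s_l),
            # with the 1/sigma^2 factor folded into the learning rate.
            score = np.log(psi[:, c] + 1e-12)
            weight = y * (score - y @ score)
            m += lr_m * weight[:, None] * (x - m)
    return m, psi
```

For example, m, psi = discriminative_clustering(X, C, n_clusters=5, n_classes=3) for primary data X of shape (n, d) and integer auxiliary labels C; the returned prototypes partition the primary space while the rows of psi summarize each cluster's auxiliary class profile.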