Incorporating the loss function into discriminative clustering of structured outputs

  • Authors:
  • Wenliang Zhong; Weike Pan; James T. Kwok; Ivor W. Tsang

  • Affiliations:
  • Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, China (W. Zhong, W. Pan, J. T. Kwok)
  • School of Computer Engineering, Nanyang Technological University, Singapore (I. W. Tsang)

  • Venue:
  • IEEE Transactions on Neural Networks
  • Year:
  • 2010

Abstract

Clustering using the Hilbert-Schmidt independence criterion (CLUHSIC) is a recent clustering algorithm that maximizes the dependence between cluster labels and data observations according to the Hilbert-Schmidt independence criterion (HSIC). It is unique in that structural information on the cluster outputs can be easily utilized in the clustering process. However, while the choice of the loss function is known to be very important in supervised learning with structured outputs, we show in this paper that CLUHSIC implicitly uses the often inappropriate zero-one loss. We propose an extension called CLUHSICAL (which stands for "Clustering using HSIC and loss") that explicitly considers both the output dependency and the loss function. Its optimization problem has the same form as that of CLUHSIC, except that its partition matrix is constructed in a different manner. Experimental results on a number of datasets with structured outputs show that CLUHSICAL often outperforms CLUHSIC in terms of both structured loss and clustering accuracy.
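Below is a minimal sketch (in Python with NumPy, not the authors' code) of the kind of HSIC-style objective the abstract describes: the dependence between a kernel on the data and an output kernel induced by a partition matrix together with a kernel on the cluster structure. The kernel choices, the toy data, and all function names here are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Gaussian (RBF) kernel matrix on the rows of X (an assumed choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * d2)

def hsic_clustering_objective(K, Pi, A):
    """HSIC-style dependence tr(H K H Pi A Pi^T) between the data kernel
    K (n x n) and the output kernel Pi A Pi^T induced by the partition
    matrix Pi (n x c) and a kernel A (c x c) on the cluster structure."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(H @ K @ H @ Pi @ A @ Pi.T)

# Toy example: 6 points in 3 clusters arranged in a chain 0-1-2.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=m, scale=0.1, size=(2, 2)) for m in (0.0, 2.0, 4.0)])
K = rbf_kernel(X, gamma=0.5)

# Hard partition: each row of Pi is a one-hot cluster indicator.
Pi = np.repeat(np.eye(3), 2, axis=0)

# Chain-structured kernel on cluster outputs: adjacent clusters get
# partial credit. A = np.eye(3) would instead mirror the zero-one loss.
A = np.array([[1.0, 0.5, 0.0],
              [0.5, 1.0, 0.5],
              [0.0, 0.5, 1.0]])

print("objective:", hsic_clustering_objective(K, Pi, A))
```

With A equal to the identity, clusterings receive no partial credit for assignments that are close in the output structure, which mirrors the zero-one loss the paper argues CLUHSIC implicitly uses; CLUHSICAL keeps the same form of optimization problem but, per the abstract, constructs the partition matrix differently so that the structured loss enters explicitly.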