Semi-supervised Laplacian Regularization of Kernel Canonical Correlation Analysis

  • Authors:
  • Matthew B. Blaschko; Christoph H. Lampert; Arthur Gretton

  • Affiliations:
  • Department of Empirical Inference, Max Planck Institute for Biological Cybernetics, 72076 Tübingen, Germany (all authors)

  • Venue:
  • ECML PKDD '08: Proceedings of the 2008 European Conference on Machine Learning and Knowledge Discovery in Databases - Part I
  • Year:
  • 2008

Abstract

Kernel canonical correlation analysis (KCCA) is a fundamental dimensionality reduction technique for paired data. By finding directions that maximize correlation in the space implied by the kernel, KCCA learns representations that are more closely tied to the underlying semantics of the data than to high-variance directions, which PCA finds but which may be the result of noise. However, meaningful directions are not only those that have high correlation with another modality, but also those that capture the manifold structure of the data. We propose a method that simultaneously finds directions that are highly correlated across modalities and of high variance along the data manifold. This is achieved by incorporating semi-supervised Laplacian regularization into the KCCA formulation, which has the additional benefit that data for which the correspondence between modalities is unknown can be used to estimate the structure of the data manifold more robustly. We show experimentally on datasets of images and text that Laplacian-regularized training improves class separation over KCCA with only Tikhonov regularization, while causing no degradation in the correlation between modalities. We propose a model selection criterion based on the Hilbert-Schmidt norm of the semi-supervised Laplacian-regularized cross-covariance operator, which can be computed in closed form.
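
To make the regularization scheme concrete, below is a minimal NumPy/SciPy sketch of Laplacian-regularized KCCA posed as a generalized eigenvalue problem, which is the standard way such a formulation is solved. The function names, the `eps` (Tikhonov) and `gamma` (Laplacian) weights, and the kNN graph construction are illustrative assumptions, not the authors' exact implementation; in the semi-supervised setting, the kernel matrices and graph Laplacians would be built over paired plus unpaired points rather than only the paired sample shown here.

```python
# Sketch of Laplacian-regularized KCCA (assumptions noted in the comments;
# not the paper's reference implementation).
import numpy as np
from scipy.linalg import eigh
from scipy.sparse.csgraph import laplacian

def graph_laplacian(X, n_neighbors=5):
    """Unnormalized graph Laplacian of a symmetrized kNN graph on rows of X."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # squared distances
    idx = np.argsort(d2, axis=1)[:, 1:n_neighbors + 1]     # skip self (col 0)
    W = np.zeros_like(d2)
    rows = np.repeat(np.arange(X.shape[0]), n_neighbors)
    W[rows, idx.ravel()] = 1.0
    W = np.maximum(W, W.T)                                  # symmetrize
    return laplacian(W)                                     # L = D - W

def laplacian_kcca(Kx, Ky, Lx, Ly, eps=1e-3, gamma=1e-2, k=2):
    """Solve A v = rho * B v, where the 'variance' blocks of B carry both a
    Tikhonov term (eps * K) and a Laplacian term (gamma * K L K) penalizing
    variation along the estimated data manifold."""
    n = Kx.shape[0]
    Z = np.zeros((n, n))
    A = np.block([[Z, Kx @ Ky], [Ky @ Kx, Z]])              # cross-covariance
    Rx = Kx @ Kx + eps * Kx + gamma * Kx @ Lx @ Kx
    Ry = Ky @ Ky + eps * Ky + gamma * Ky @ Ly @ Ky
    B = np.block([[Rx, Z], [Z, Ry]])
    B = 0.5 * (B + B.T) + 1e-8 * np.eye(2 * n)              # numerical jitter
    vals, vecs = eigh(A, B)                                  # generalized eig
    top = np.argsort(vals)[::-1][:k]                         # top correlations
    return vals[top], vecs[:n, top], vecs[n:, top]
```

The design choice to solve for both modalities jointly (the 2n x 2n block system) is the usual symmetric formulation of KCCA; an equivalent n x n problem per modality can be obtained by substitution when memory is a concern.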