We propose a novel method of dimensionality reduction for supervised learning problems. Given a regression or classification problem in which we wish to predict a response variable Y from an explanatory variable X, we treat dimensionality reduction as the problem of finding a low-dimensional "effective subspace" for X that retains the statistical relationship between X and Y. We show that this problem can be formulated in terms of conditional independence. To turn this formulation into an optimization problem, we establish a general nonparametric characterization of conditional independence using covariance operators on reproducing kernel Hilbert spaces. This characterization allows us to derive a contrast function for estimating the effective subspace. Unlike many conventional methods for dimensionality reduction in supervised learning, the proposed method requires neither assumptions on the marginal distribution of X nor a parametric model of the conditional distribution of Y. We present experiments comparing the performance of the method with that of conventional methods.
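To make the contrast function concrete, the following is a minimal sketch (not the authors' implementation), assuming Gaussian RBF kernels on both variables and the regularized empirical trace Tr[Kbar_Y (Kbar_U + n*eps*I)^{-1}], where U = B^T X is the projection onto the candidate subspace and Kbar denotes a centered Gram matrix. The names kdr_contrast and rbf_gram, and the settings of sigma and eps, are illustrative choices, not part of the paper.

import numpy as np

def rbf_gram(Z, sigma):
    """Centered Gaussian RBF Gram matrix for a sample Z of shape (n, d)."""
    sq = np.sum(Z ** 2, axis=1)
    K = np.exp(-(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T) / (2.0 * sigma ** 2))
    n = Z.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix: Kbar = H K H
    return H @ K @ H

def kdr_contrast(X, Y, B, sigma=1.0, eps=1e-3):
    """Empirical contrast Tr[Kbar_Y (Kbar_U + n*eps*I)^{-1}] for U = X @ B.

    X: (n, p) explanatory variables; Y: (n, q) responses (a column vector
    for scalar regression); B: (p, m) candidate projection with m < p.
    Smaller values indicate that the projected data retain more of the
    information in X about Y. sigma and eps are illustrative settings.
    """
    n = X.shape[0]
    Ku = rbf_gram(X @ B, sigma)
    Ky = rbf_gram(Y, sigma)
    # trace(Ky @ inv(A)) computed via a linear solve for numerical stability
    return np.trace(np.linalg.solve(Ku + n * eps * np.eye(n), Ky))

Estimating the effective subspace then amounts to minimizing this quantity over orthonormal projection matrices B (for example, by gradient descent with the kernel width annealed over iterations, roughly as the paper describes); the sketch above only evaluates the objective for a fixed B.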