We propose a novel method of dimensionality reduction for supervised learning problems. Given a regression or classification problem in which we wish to predict a response variable Y from an explanatory variable X, we treat dimensionality reduction as the problem of finding a low-dimensional "effective subspace" for X that retains the statistical relationship between X and Y. We show that this problem can be formulated in terms of conditional independence. To turn this formulation into an optimization problem, we establish a general nonparametric characterization of conditional independence using covariance operators on reproducing kernel Hilbert spaces. This characterization allows us to derive a contrast function for estimating the effective subspace. Unlike many conventional methods for dimensionality reduction in supervised learning, the proposed method requires neither assumptions on the marginal distribution of X nor a parametric model of the conditional distribution of Y. We present experiments comparing the performance of the method with that of conventional methods.
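To make the contrast function concrete, the following is a minimal sketch (not the authors' implementation), assuming Gaussian RBF kernels on both variables and the regularized empirical trace Tr[Kbar_Y (Kbar_U + n*eps*I)^{-1}], where U = B^T X is the projection onto the candidate subspace and Kbar denotes a centered Gram matrix. The names kdr_contrast and rbf_gram, and the settings of sigma and eps, are illustrative choices, not part of the paper.

import numpy as np

def rbf_gram(Z, sigma):
    """Centered Gaussian RBF Gram matrix for a sample Z of shape (n, d)."""
    sq = np.sum(Z ** 2, axis=1)
    K = np.exp(-(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T) / (2.0 * sigma ** 2))
    n = Z.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix: Kbar = H K H
    return H @ K @ H

def kdr_contrast(X, Y, B, sigma=1.0, eps=1e-3):
    """Empirical contrast Tr[Kbar_Y (Kbar_U + n*eps*I)^{-1}] for U = X @ B.

    X: (n, p) explanatory variables; Y: (n, q) responses (a column vector
    for scalar regression); B: (p, m) candidate projection with m < p.
    Smaller values indicate that the projected data retain more of the
    information in X about Y. sigma and eps are illustrative settings.
    """
    n = X.shape[0]
    Ku = rbf_gram(X @ B, sigma)
    Ky = rbf_gram(Y, sigma)
    # trace(Ky @ inv(A)) computed via a linear solve for numerical stability
    return np.trace(np.linalg.solve(Ku + n * eps * np.eye(n), Ky))

Estimating the effective subspace then amounts to minimizing this quantity over orthonormal projection matrices B (for example, by gradient descent with the kernel width annealed over iterations, roughly as the paper describes); the sketch above only evaluates the objective for a fixed B.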