Neural Computation
The goal of sufficient dimension reduction in supervised learning is to find the low-dimensional subspace of input features that contains all of the information about the output values that the input features possess. In this letter, we propose a novel sufficient dimension-reduction method using a squared-loss variant of mutual information as a dependency measure. We apply a density-ratio estimator for approximating squared-loss mutual information that is formulated as a minimum contrast estimator on parametric or nonparametric models. Since cross-validation is available for choosing an appropriate model, our method does not require any prespecified structure on the underlying distributions. We elucidate the asymptotic bias of our estimator on parametric models and the asymptotic convergence rate on nonparametric models. The convergence analysis utilizes the uniform tail-bound of a U-process, and the convergence rate is characterized by the bracketing entropy of the model. We then develop a natural gradient algorithm on the Grassmann manifold for sufficient subspace search. The analytic formula of our estimator allows us to compute the gradient efficiently. Numerical experiments show that the proposed method compares favorably with existing dimension-reduction approaches on artificial and benchmark data sets.
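A minimal sketch of the quantities involved (notation assumed here rather than taken from the abstract: p_{xy}, p_x, p_y denote the joint and marginal densities, W a matrix with orthonormal columns spanning the candidate subspace, and f the objective being maximized). Squared-loss mutual information is commonly defined as the Pearson divergence between the joint density and the product of the marginals,

\[
\mathrm{SMI}(X, Y) = \frac{1}{2} \iint p_x(x)\, p_y(y) \left( \frac{p_{xy}(x, y)}{p_x(x)\, p_y(y)} - 1 \right)^{2} \mathrm{d}x \, \mathrm{d}y ,
\]

which vanishes if and only if X and Y are statistically independent, so the density ratio p_{xy}/(p_x p_y) is the natural target of direct estimation. For the subspace search, the natural gradient of f(W) on the Grassmann manifold (under the column-orthonormal parameterization W^{\top} W = I) is the Euclidean gradient projected onto the horizontal space, \nabla f(W) - W W^{\top} \nabla f(W), with the update taken along the corresponding geodesic; the analytic form of the estimator mentioned above is what makes this gradient inexpensive to evaluate.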