We present a new linear discriminant analysis method based on information theory, in which the mutual information between the linearly transformed input data and the class labels is maximized. First, we introduce a kernel-based estimate of mutual information with a variable kernel size. We then devise a learning algorithm that maximizes this mutual information with respect to the linear transformation. Two experiments are conducted: the first uses a toy problem to visualize and compare the transformation vectors in the original input space; the second evaluates classification performance through cross-validation tests on four datasets from the UCI repository, using a variety of classifiers. Our results show that the method can significantly improve class separability over conventional approaches, especially for nonlinear classification.
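The core idea — gradient ascent on a kernel (Parzen-window) estimate of the mutual information between a linear projection of the data and the class labels — can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: it substitutes Rényi's quadratic entropy for Shannon entropy as a tractable kernel-based proxy, uses a fixed kernel size `sigma` rather than the variable kernel size described in the abstract, and computes gradients by finite differences instead of an analytic derivation. All function names are hypothetical.

```python
import numpy as np

def parzen_entropy(z, sigma):
    # Renyi quadratic entropy with a Gaussian Parzen window:
    # H2(z) = -log( (1/n^2) * sum_ij exp(-||z_i - z_j||^2 / (4 sigma^2)) )
    d2 = ((z[:, None, :] - z[None, :, :]) ** 2).sum(-1)
    return -np.log(np.exp(-d2 / (4.0 * sigma ** 2)).mean())

def mi_estimate(X, y, w, sigma):
    # Proxy for I(z; y) = H(z) - sum_c p(c) H(z | c), where z = X w.
    z = X @ w
    h_cond = sum(np.mean(y == c) * parzen_entropy(z[y == c], sigma)
                 for c in np.unique(y))
    return parzen_entropy(z, sigma) - h_cond

def fit_projection(X, y, r=1, sigma=1.0, lr=0.2, steps=100, eps=1e-4, seed=0):
    # Gradient ascent on the MI proxy via central finite differences,
    # renormalizing the projection columns to unit length each step.
    rng = np.random.default_rng(seed)
    w = rng.standard_normal((X.shape[1], r))
    w /= np.linalg.norm(w, axis=0)
    for _ in range(steps):
        g = np.zeros_like(w)
        for idx in np.ndindex(*w.shape):
            wp, wm = w.copy(), w.copy()
            wp[idx] += eps
            wm[idx] -= eps
            g[idx] = (mi_estimate(X, y, wp, sigma)
                      - mi_estimate(X, y, wm, sigma)) / (2 * eps)
        w += lr * g
        w /= np.linalg.norm(w, axis=0)
    return w
```

On a two-class 2-D toy problem with the classes separated along one axis, this sketch recovers a projection close to the discriminative direction — the same qualitative behavior the abstract's toy experiment visualizes. The choice of `sigma` matters in practice, which is one motivation for the variable kernel size used in the actual method.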