Because of the difficulty of obtaining an analytic expression for the Bayes error, a wide variety of separability measures have been proposed for feature selection. In this paper, we show that a general framework based on the criterion of mutual information (MI) can provide a realistic solution to the problem of feature selection for high-dimensional data. We give a theoretical argument showing that the MI of multi-dimensional data can be broken down into several one-dimensional components, which makes numerical evaluation much easier and more accurate. It also reveals that selection based on the simple criterion of retaining only those features with high associated MI values may be problematic when the features are highly correlated. Although features can be selected directly by jointly maximising MI, this approach suffers from combinatorial explosion. Hence, we propose a fast feature-selection scheme based on a 'greedy' optimisation strategy. To confirm the effectiveness of this scheme, simulations are carried out on 16 land-cover classes using the 92AV3C data set collected by the 220-band AVIRIS hyperspectral sensor. We replicate our earlier positive results (which used an essentially heuristic method for MI-based band selection), but at much reduced computational cost and with a much sounder theoretical basis.
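To illustrate the kind of greedy MI-based selection described above (a generic sketch, not the authors' exact criterion), the following Python code scores each candidate band by its relevance I(band; class) penalised by its average redundancy with the bands already chosen, an mRMR-style surrogate for jointly maximising MI. The histogram MI estimator, bin count, and redundancy weight are all illustrative assumptions.

```python
import numpy as np

def mutual_info(x, y, bins=16):
    # Histogram estimate of I(X;Y) in nats from two 1-D samples.
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def greedy_select(X, y, n_select, bins=16, redundancy_weight=1.0):
    # Greedy band selection: start from the single most relevant band,
    # then repeatedly add the band maximising relevance minus the mean
    # redundancy with the bands selected so far (mRMR-style score).
    n_bands = X.shape[1]
    relevance = np.array([mutual_info(X[:, j], y, bins) for j in range(n_bands)])
    selected = [int(np.argmax(relevance))]
    while len(selected) < n_select:
        best_j, best_score = -1, -np.inf
        for j in range(n_bands):
            if j in selected:
                continue
            redundancy = np.mean([mutual_info(X[:, j], X[:, s], bins)
                                  for s in selected])
            score = relevance[j] - redundancy_weight * redundancy
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected

# Toy usage: 500 pixels, 20 synthetic bands, 4 classes; only the first
# three bands carry class information by construction.
rng = np.random.default_rng(0)
y = rng.integers(0, 4, size=500)
X = rng.normal(size=(500, 20))
X[:, :3] += y[:, None]
print(greedy_select(X, y, n_select=5))   # should favour bands 0, 1, 2
```

Each step evaluates one MI term per remaining band against each selected band, so the cost grows only polynomially with the number of bands, avoiding the combinatorial search over all band subsets that the abstract warns against.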