Relevance measures based on the notion of mutual information are studied in the framework of subset variable selection for regression. Results on the estimation of this index of stochastic dependence in a continuous setting are presented first. They are grounded on kernel density estimation, which makes the overall estimation of the mutual information quadratic in computational cost. The behavior of the mutual information as a relevance measure is then studied empirically on several regression problems. These problems are artificially generated to contain irrelevant and redundant candidate explanatory variables as well as strongly nonlinear relationships. Next, still in a subset variable selection context, computationally more efficient approximations of the mutual information based on the notion of k-additive truncation are proposed. The 2- and 3-additive truncations appear to be of practical interest as relevance measures. The 2-additive truncation computes the approximate relevance of a set of potential predictors from the relevance values of the singletons and pairs it contains. The 3-additive truncation additionally involves the relevance values of the 3-element subsets. The less redundancy there is among the candidate explanatory variables, the better these approximations are. Finally, the sample behavior of the two resulting relevance measures is studied empirically on the same artificial nonlinear regression problems.
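As a rough illustration of how a k-additive truncation can rebuild the relevance of a set from small subsets only, the sketch below assumes the truncation is obtained by keeping the Möbius transform terms of subsets with at most k elements. The abstract does not spell out this construction, so it is an assumption, as are the function names and the toy relevance values, which are hypothetical and do not come from the paper.

```python
from itertools import combinations


def mobius(mu, subset):
    """Moebius transform of the set function `mu` at `subset`.

    m(T) = sum over S subset of T of (-1)^(|T| - |S|) * mu(S).
    """
    items = sorted(subset)
    total = 0.0
    for r in range(len(items) + 1):
        sign = (-1) ** (len(items) - r)
        for inner in combinations(items, r):
            total += sign * mu(frozenset(inner))
    return total


def k_additive_truncation(mu, subset, k):
    """Approximate mu(subset) using only subsets of size at most k.

    Assumes mu(empty set) = 0, so the empty-set term is skipped.
    """
    items = sorted(subset)
    total = 0.0
    for r in range(1, min(k, len(items)) + 1):
        for inner in combinations(items, r):
            total += mobius(mu, frozenset(inner))
    return total


# Hypothetical relevance values for three candidate predictors 0, 1, 2.
# Predictors 0 and 1 are partly redundant: together they are worth less
# than the sum of their individual relevances.
relevance = {
    frozenset(): 0.0,
    frozenset({0}): 0.5,
    frozenset({1}): 0.4,
    frozenset({2}): 0.1,
    frozenset({0, 1}): 0.7,
    frozenset({0, 2}): 0.6,
    frozenset({1, 2}): 0.5,
    frozenset({0, 1, 2}): 0.75,
}


def mu(subset):
    """Look up the toy relevance of a subset of predictors."""
    return relevance[frozenset(subset)]


full = frozenset({0, 1, 2})
approx_2 = k_additive_truncation(mu, full, 2)  # singletons and pairs only
approx_3 = k_additive_truncation(mu, full, 3)  # also uses the 3-element subset
```

With these toy values the 2-additive truncation needs only the six singleton and pair relevances, while the 3-additive truncation recovers the full value exactly because the set has only three elements. In a real setting each relevance value would be an estimated mutual information, which is where restricting the computation to small subsets saves work.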