C4.5: programs for machine learning
On the limits of proper learnability of subclasses of DNF formulas
COLT '94 Proceedings of the seventh annual conference on Computational learning theory
An introduction to computational learning theory
On the boosting ability of top-down decision tree learning algorithms
STOC '96 Proceedings of the twenty-eighth annual ACM symposium on Theory of computing
Selection of relevant features and examples in machine learning
Artificial Intelligence - Special issue on relevance
Improved boosting algorithms using confidence-rated predictions
COLT' 98 Proceedings of the eleventh annual conference on Computational learning theory
Machine Learning
Complexity and Approximation: Combinatorial Optimization Problems and Their Approximability Properties
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
On the Power of Decision Lists
ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
Generalized Graph Colorability and Compressibility of Boolean Formulae
ISAAC '98 Proceedings of the 9th International Symposium on Algorithms and Computation
Combining Feature and Example Pruning by Uncertainty Minimization
UAI '00 Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence
Function-Free Horn Clauses Are Hard to Approximate
ILP '98 Proceedings of the 8th International Workshop on Inductive Logic Programming
On the Hardness of Approximating Max k-Cut and Its Dual
Approximation algorithms for combinatorial problems
Journal of Computer and System Sciences
Detecting Irrelevant Subtrees to Improve Probabilistic Learning from Tree-structured Data
Fundamenta Informaticae - Advances in Mining Graphs, Trees and Sequences
As pointed out by Blum [Blu94], "nearly all results in Machine Learning [...] deal with problems of separating relevant from irrelevant information in some way". This paper is concerned with structural complexity issues in the selection of relevant prototypes or features. We give the first results proving that, for various notions of relevance, both problems can be much harder than the literature suggests. In particular, the worst-case bounds achievable by any efficient algorithm are shown to be very large, often not far from trivial bounds. We believe these results provide a theoretical justification for the numerous heuristic approaches proposed in the literature to cope with these problems.