C4.5: programs for machine learning
On the limits of proper learnability of subclasses of DNF formulas
COLT '94 Proceedings of the seventh annual conference on Computational learning theory
An introduction to computational learning theory
On the boosting ability of top-down decision tree learning algorithms
STOC '96 Proceedings of the twenty-eighth annual ACM symposium on Theory of computing
Selection of relevant features and examples in machine learning
Artificial Intelligence - Special issue on relevance
Improved boosting algorithms using confidence-rated predictions
COLT' 98 Proceedings of the eleventh annual conference on Computational learning theory
Machine Learning
Complexity and Approximation: Combinatorial Optimization Problems and Their Approximability Properties
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
On the Power of Decision Lists
ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
Generalized Graph Colorability and Compressibility of Boolean Formulae
ISAAC '98 Proceedings of the 9th International Symposium on Algorithms and Computation
Combining Feature and Example Pruning by Uncertainty Minimization
UAI '00 Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence
Function-Free Horn Clauses Are Hard to Approximate
ILP '98 Proceedings of the 8th International Workshop on Inductive Logic Programming
On the Hardness of Approximating Max k-Cut and Its Dual
Approximation algorithms for combinatorial problems
Journal of Computer and System Sciences
Detecting Irrelevant Subtrees to Improve Probabilistic Learning from Tree-structured Data
Fundamenta Informaticae - Advances in Mining Graphs, Trees and Sequences
As pointed out by Blum [Blu94], "nearly all results in Machine Learning [...] deal with problems of separating relevant from irrelevant information in some way". This paper is concerned with structural complexity issues in the selection of relevant prototypes or features. We give the first results proving that, for various notions of relevance, both problems can be much harder than the literature suggests. In particular, the worst-case bounds achievable by any efficient algorithm are shown to be very large, often not far from trivial bounds. We believe these results provide a theoretical justification for the numerous heuristic approaches proposed in the literature to cope with these problems.