Process-oriented estimation of generalization error

Authors:
Pedro Domingos
Affiliations:
Artificial Intelligence Group, Instituto Superior Tecnico, Lisbon, Portugal
Venue:
IJCAI'99 Proceedings of the 16th international joint conference on Artificial intelligence - Volume 2
Year:
1999

Citing 15
Cited 5

Boolean Feature Discovery in Empirical Learning

Machine Learning
An ounce of knowledge is worth a ton of data: quantitative studies of the trade-off between expertise and data based on statistically well-founded empirical induction

Proceedings of the sixth international workshop on Machine learning
Rule induction with CN2: some recent improvements

EWSL-91 Proceedings of the European working session on learning on Machine learning
C4.5: programs for machine learning

C4.5: programs for machine learning
Efficient noise-tolerant learning from statistical queries

STOC '93 Proceedings of the twenty-fifth annual ACM symposium on Theory of computing
An introduction to computational learning theory

An introduction to computational learning theory
The nature of statistical learning theory

The nature of statistical learning theory
Efficient Approximations for the MarginalLikelihood of Bayesian Networks with Hidden Variables

Machine Learning - Special issue on learning with probabilistic representations
Self bounding learning algorithms

COLT' 98 Proceedings of the eleventh annual conference on Computational learning theory
Multiple Comparisons in Induction Algorithms

Machine Learning
The CN2 Induction Algorithm

Machine Learning
On Estimating Probabilities in Tree Pruning

EWSL '91 Proceedings of the European Working Session on Machine Learning
Preventing "Overfitting" of Cross-Validation Data

ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
A Process-Oriented Heuristic for Model Selection

ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
Oversearching and layered search in empirical learning

IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 2

Beyond Occam's Razor: Process-Oriented Evaluation

ECML '00 Proceedings of the 11th European Conference on Machine Learning
Average-Case Analysis of Classification Algorithms for Boolean Functions and Decision Trees

ALT '00 Proceedings of the 11th International Conference on Algorithmic Learning Theory
The Biases of Decision Tree Pruning Strategies

IDA '99 Proceedings of the Third International Symposium on Advances in Intelligent Data Analysis
Logical-shapelets: an expressive primitive for time series classification

Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Feature over-selection

SSPR'06/SPR'06 Proceedings of the 2006 joint IAPR international conference on Structural, Syntactic, and Statistical Pattern Recognition

Quantified Score

Hi-index	0.00

Visualization

Abstract

Methods to avoid overfitting fall into two broad categories: data-oriented (using separate data for validation) and representation-oriented (penalizing complexity in the model). Both have limitations that are hard to overcome. We argue that fully adequate model evaluation is only possible if the search process by which models are obtained is also taken into account. To this end, we recently proposed a method for process-oriented evaluation (P0E), and successfully applied it to rule induction [Domingos, 1998b]. However, for the sake of simplicity this treatment made a number of rather artificial assumptions. In this paper the assumptions are removed, and a simple formula for error estimation is obtained. Empirical trials show the new, better-founded form of POE to be as accurate as the previous one, while further reducing theory sizes.