Estimating the accuracy of learned concepts

Authors:
Timothy L. Bailey;Charles Elkan
Affiliations:
Department of Computer Science and Engineering, University of California, San Diego, La Jolla, California;Department of Computer Science and Engineering, University of California, San Diego, La Jolla, California
Venue:
IJCAI'93 Proceedings of the 13th International Joint Conference on Artificial Intelligence - Volume 2
Year:
1993

Abstract

This paper investigates alternative estimators of the accuracy of concepts learned from examples. In particular, the cross-validation and .632 bootstrap estimators are studied, using synthetic training data and the FOIL learning algorithm. Our experimental results contradict previous papers in statistics, which advocate the .632 bootstrap method as superior to cross-validation. Nevertheless, our results also suggest that conclusions based on cross-validation in previous machine learning papers are unreliable. Specifically, we observe that (i) the true error of the concept learned by FOIL varies widely across independently drawn sets of examples of the same concept, (ii) the estimate of true error provided by cross-validation has high variability but is approximately unbiased, and (iii) the .632 bootstrap estimator has lower variability than cross-validation but is systematically biased.
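
For readers unfamiliar with the two estimators, the following is a minimal sketch of how each can be computed. It is not the authors' experimental setup: a hypothetical one-dimensional threshold learner (train_stump) stands in for FOIL, the synthetic data generator (make_data) is invented for illustration, and all function names are assumptions. The .632 weighting uses the standard definition, 0.368 times the resubstitution error plus 0.632 times the average error on examples left out of each bootstrap sample.

```python
import random


def make_data(n, rng, noise=0.1):
    """Illustrative synthetic data: label is x >= 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = (x >= 0.5) != (rng.random() < noise)
        data.append((x, y))
    return data


def train_stump(examples):
    """Hypothetical stand-in learner (not FOIL): choose the threshold t that
    minimises training error for the rule 'x >= t means positive'."""
    best_t, best_err = None, None
    for t in sorted({x for x, _ in examples}):
        err = sum((x >= t) != y for x, y in examples)
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return lambda x: x >= best_t


def error_rate(classifier, examples):
    """Fraction of examples the classifier labels incorrectly."""
    return sum(classifier(x) != y for x, y in examples) / len(examples)


def cross_validation_error(examples, k, learn):
    """k-fold cross-validation: each example is tested once by a model
    trained on the other k - 1 folds."""
    folds = [examples[i::k] for i in range(k)]
    mistakes = 0
    for i, held_out in enumerate(folds):
        train = [e for j, fold in enumerate(folds) if j != i for e in fold]
        classifier = learn(train)
        mistakes += sum(classifier(x) != y for x, y in held_out)
    return mistakes / len(examples)


def bootstrap_632_error(examples, rounds, learn, rng):
    """.632 bootstrap: blend resubstitution error with the average error on
    examples that were not drawn into each bootstrap sample."""
    n = len(examples)
    resubstitution = error_rate(learn(examples), examples)
    left_out_errors = []
    for _ in range(rounds):
        drawn = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        drawn_set = set(drawn)
        left_out = [examples[i] for i in range(n) if i not in drawn_set]
        if not left_out:
            continue  # nothing to test on in this round
        classifier = learn([examples[i] for i in drawn])
        left_out_errors.append(error_rate(classifier, left_out))
    if not left_out_errors:
        raise ValueError("no bootstrap round left any example out")
    e0 = sum(left_out_errors) / len(left_out_errors)
    return 0.368 * resubstitution + 0.632 * e0


if __name__ == "__main__":
    data = make_data(40, random.Random(0))
    print("10-fold CV estimate:", cross_validation_error(data, 10, train_stump))
    print(".632 bootstrap estimate:",
          bootstrap_632_error(data, 50, train_stump, random.Random(1)))
```

Repeating this over many independently drawn training sets, and comparing each estimate with the error measured on a large fresh sample, is one way to examine the variability and bias of the two estimators that the abstract describes.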