Practical Algorithms for On-line Sampling

Authors:
Carlos Domingo;Ricard Gavaldà;Osamu Watanabe
Venue:
DS '98 Proceedings of the First International Conference on Discovery Science
Year:
1998

Citing 7

Maximizing the predictive value of production rules
Artificial Intelligence

Toward efficient agnostic learning
COLT '92 Proceedings of the fifth annual workshop on Computational learning theory

Decision theoretic generalizations of the PAC model for neural net and other learning applications
Information and Computation

Very Simple Classification Rules Perform Well on Most Commonly Used Datasets
Machine Learning

An introduction to computational learning theory

Bagging predictors
Machine Learning

A decision-theoretic generalization of on-line learning and an application to boosting
Journal of Computer and System Sciences - Special issue: 26th annual ACM symposium on the theory of computing & STOC'94, May 23–25, 1994, and second annual European conference on computational learning theory (EuroCOLT'95), March 13–15, 1995

Cited 8

Sequential Sampling Algorithms: Unified Analysis and Lower Bounds
SAGA '01 Proceedings of the International Symposium on Stochastic Algorithms: Foundations and Applications

From Computational Learning Theory to Discovery Science
ICALP '99 Proceedings of the 26th International Colloquium on Automata, Languages and Programming

Scaling Up a Boosting-Based Learner via Adaptive Sampling
PAKDD '00 Proceedings of the 4th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Current Issues and New Applications

On a Generalized Ruin Problem
APPROX '01/RANDOM '01 Proceedings of the 4th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems and 5th International Workshop on Randomization and Approximation Techniques in Computer Science: Approximation, Randomization and Combinatorial Optimization

How Can Computer Science Contribute to Knowledge Discovery?
SOFSEM '01 Proceedings of the 28th Conference on Current Trends in Theory and Practice of Informatics Piestany: Theory and Practice of Informatics

Faster Near-Optimal Reinforcement Learning: Adding Adaptiveness to the E3 Algorithm
ALT '99 Proceedings of the 10th International Conference on Algorithmic Learning Theory

Adaptive Sampling Methods for Scaling Up Knowledge Discovery Algorithms
DS '99 Proceedings of the Second International Conference on Discovery Science

Sequential Sampling Techniques for Algorithmic Learning Theory
ALT '00 Proceedings of the 11th International Conference on Algorithmic Learning Theory

Abstract

One of the core applications of machine learning to knowledge discovery is building a hypothesis (such as a decision tree or neural network) from a given amount of data, so that we can later use it to predict new instances of the data. In this paper, we focus on a particular situation where we assume that the hypothesis we want to use for prediction is a very simple one, so the hypothesis class is of feasible size. We study the problem of how to determine which of the hypotheses in the class is almost the best one. We present two on-line sampling algorithms for selecting a hypothesis, give theoretical bounds on the number of examples needed, and analyze them experimentally. We compare them with the simple batch sampling approach commonly used and show that in most situations our algorithms use a much smaller number of examples.
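
For orientation, the batch sampling baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration and not the authors' code. It assumes the standard Hoeffding-plus-union-bound sample size, under which the selected hypothesis has accuracy within eps of the best in the class with probability at least 1 - delta. The names `batch_sample_size`, `batch_select`, and `draw_example` are hypothetical, and the abstract does not describe the paper's two on-line algorithms in enough detail to sketch them here.

```python
import math


def batch_sample_size(n_hypotheses, eps, delta):
    """Sample size from Hoeffding's inequality plus a union bound (assumed form).

    With this many examples, every empirical accuracy is within eps/2 of its
    true value with probability at least 1 - delta.
    """
    return math.ceil((2.0 / eps ** 2) * math.log(2.0 * n_hypotheses / delta))


def batch_select(hypotheses, draw_example, eps, delta):
    """Draw a fixed-size sample up front, then return the empirically best index.

    hypotheses   -- list of callables h(x) -> predicted label
    draw_example -- hypothetical callable returning one labeled pair (x, y)
    """
    m = batch_sample_size(len(hypotheses), eps, delta)
    correct = [0] * len(hypotheses)
    for _ in range(m):
        x, y = draw_example()
        for i, h in enumerate(hypotheses):
            if h(x) == y:
                correct[i] += 1
    # Ties are broken in favor of the lowest index.
    best = max(range(len(hypotheses)), key=lambda i: correct[i])
    return best, m
```

The sample size here is fixed in advance from the class size, eps, and delta alone, regardless of how well separated the hypotheses turn out to be. An on-line sampler, by contrast, may stop as soon as the examples seen so far are conclusive, which is the source of the savings the abstract reports.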