Learning and classifying under hard budgets

Authors:
Aloak Kapoor;Russell Greiner
Affiliations:
Department of Computing Science, University of Alberta, Edmonton, AB;Department of Computing Science, University of Alberta, Edmonton, AB
Venue:
ECML'05 Proceedings of the 16th European conference on Machine Learning
Year:
2005



Abstract

Since resources for data acquisition are seldom infinite, both learners and classifiers must act intelligently under hard budgets. In this paper, we consider problems in which feature values are unknown to both the learner and the classifier, but can be acquired at a cost. Our goal is a learner that spends its fixed learning budget b_L acquiring training data in order to produce the most accurate “active classifier” that spends at most b_C per instance. To produce this fixed-budget classifier, the fixed-budget learner must sequentially decide which feature values to collect in order to learn the relevant information about the distribution. We explore several approaches the learner can take, including the standard “round robin” policy (purchasing every feature of every instance until the budget b_L is exhausted). We demonstrate empirically that round robin is problematic (especially for small b_L), and provide alternative learning strategies that achieve superior performance on a variety of datasets.
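
The round-robin baseline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `round_robin_purchases`, the per-feature cost list, and the rule of stopping at the first feature the remaining budget cannot cover are all assumptions made for illustration.

```python
from typing import List, Tuple


def round_robin_purchases(
    n_instances: int,
    feature_costs: List[float],
    budget: float,
) -> List[Tuple[int, int]]:
    """Return the (instance, feature) pairs bought, in purchase order.

    Hypothetical helper: instances are visited in order, and every
    feature of the current instance is bought before moving on.
    Purchasing stops at the first feature whose cost exceeds the
    remaining learning budget (an assumed stopping rule).
    """
    purchases: List[Tuple[int, int]] = []
    remaining = budget
    for instance in range(n_instances):
        for feature, cost in enumerate(feature_costs):
            if cost > remaining:
                return purchases  # budget exhausted
            remaining -= cost
            purchases.append((instance, feature))
    return purchases


# Three instances, two features costing 1 and 2, learning budget of 4:
# the first instance is bought in full, the second only partially.
bought = round_robin_purchases(3, [1.0, 2.0], 4.0)
```

With a small budget, this policy fully describes a few instances and leaves the rest untouched, regardless of which features turn out to be informative. That is the situation in which the abstract reports round robin to be problematic.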