Economical active feature-value acquisition through Expected Utility estimation

Authors:
Prem Melville;Foster Provost;Maytal Saar-Tsechansky;Raymond Mooney
Affiliations:
Univ. of Texas at Austin;New York University;Univ. of Texas at Austin;Univ. of Texas at Austin
Venue:
UBDM '05 Proceedings of the 1st international workshop on Utility-based data mining
Year:
2005

Abstract

In many classification tasks, training data have missing feature values that can be acquired at a cost. For building accurate predictive models, acquiring all missing values is often prohibitively expensive or unnecessary, while acquiring a random subset of feature values may not be the most effective strategy. The goal of active feature-value acquisition is to incrementally select the feature values that are most cost-effective for improving the model's accuracy. We present two policies, Sampled Expected Utility and Expected Utility-ES, that acquire feature values for inducing a classification model based on an estimate of the expected improvement in model accuracy per unit cost. A comparison of the two policies with each other and with alternative policies demonstrates that Sampled Expected Utility is preferable: it effectively reduces the cost of producing a model of a desired accuracy and performs consistently across domains.
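
The abstract does not spell out how the expected improvement per unit cost is estimated, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes each candidate query is a missing (instance, feature) cell, that the caller supplies a distribution over the values the cell might take, and that the caller can estimate model accuracy if a given value were acquired. The names `expected_utility`, `select_queries`, `value_probs`, `accuracy_if`, and `cost` are hypothetical, and scoring only a uniform random sample of the candidates is an assumption about what "sampled" could mean here.

```python
import random
from typing import Callable, Dict, Hashable, List, Sequence, Tuple

# Hypothetical representation: a query names one missing cell of the training data.
Query = Tuple[int, int]  # (instance index, feature index)


def expected_utility(
    query: Query,
    value_probs: Callable[[Query], Dict[Hashable, float]],
    accuracy_if: Callable[[Query, Hashable], float],
    current_accuracy: float,
    cost: Callable[[Query], float],
) -> float:
    """Expected accuracy improvement per unit cost for one candidate query.

    The expectation is taken over the values the missing cell might take,
    weighted by the caller's estimated probability of each value.
    """
    expected_gain = sum(
        prob * (accuracy_if(query, value) - current_accuracy)
        for value, prob in value_probs(query).items()
    )
    return expected_gain / cost(query)


def select_queries(
    candidates: Sequence[Query],
    batch_size: int,
    sample_size: int,
    value_probs: Callable[[Query], Dict[Hashable, float]],
    accuracy_if: Callable[[Query, Hashable], float],
    current_accuracy: float,
    cost: Callable[[Query], float],
    rng: random.Random,
) -> List[Query]:
    """Score a random sample of the candidates and return the best batch.

    Assumption: sampling is uniform and serves only to limit how many
    candidates need the (potentially expensive) utility estimate.
    """
    pool = rng.sample(list(candidates), min(sample_size, len(candidates)))
    ranked = sorted(
        pool,
        key=lambda q: expected_utility(
            q, value_probs, accuracy_if, current_accuracy, cost
        ),
        reverse=True,
    )
    return ranked[:batch_size]
```

In practice, `accuracy_if` would typically retrain or update the model with the hypothesized value and evaluate it on held-in data, which is why limiting the number of scored candidates matters. Dividing by cost means an expensive feature is selected only when its expected gain is proportionally larger.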