In many classification tasks, training data have missing feature values that can be acquired at a cost. For building accurate predictive models, acquiring all missing values is often prohibitively expensive or unnecessary, while acquiring a random subset of feature values may not be the most cost-effective. The goal of active feature-value acquisition is to incrementally select the feature values that are most cost-effective for improving the model's accuracy. We present two policies, Sampled Expected Utility and Expected Utility-ES, that acquire feature values for inducing a classification model based on an estimate of the expected improvement in model accuracy per unit cost. A comparison of the two policies to each other and to alternative policies demonstrates that Sampled Expected Utility is preferable: it effectively reduces the cost of producing a model of a desired accuracy, and it performs consistently across domains.
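The selection criterion described above, expected improvement in model accuracy per unit acquisition cost, evaluated over a random sample of candidate acquisitions rather than all of them, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the `estimate_gain` callback (which would internally retrain and re-evaluate the classifier with the candidate value imputed or acquired), and the simple uniform sampling scheme are all hypothetical.

```python
import random

def sampled_expected_utility(candidates, estimate_gain, cost, sample_size=10, seed=0):
    """Choose one feature-value acquisition from a random sample of candidates.

    candidates    -- list of candidate (instance, feature) acquisitions
    estimate_gain -- callback returning the estimated accuracy improvement
                     from acquiring a candidate's value (hypothetical)
    cost          -- callback returning the acquisition cost of a candidate
    sample_size   -- number of candidates to evaluate (sampling keeps the
                     expected-utility estimation tractable on large pools)
    """
    rng = random.Random(seed)
    # Evaluate only a random subsample of the candidate pool.
    pool = rng.sample(candidates, min(sample_size, len(candidates)))
    # Expected utility = estimated accuracy gain per unit cost.
    return max(pool, key=lambda c: estimate_gain(c) / cost(c))
```

In an actual active-acquisition loop this selection would be repeated: after each acquisition the model is re-induced and the gain estimates are refreshed, so the per-unit-cost ranking adapts as the model improves.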