In knowledge discovery applications where new features are to be added, an acquisition policy can help select the features to acquire based on their relevance and the cost of extraction. This can be posed as a feature selection problem in which the feature values are not known in advance. We propose a technique to actively sample the feature values with the ultimate goal of choosing between alternative candidate features at minimum sampling cost. Our heuristic algorithm extracts candidate feature values in the region of the instance space where the feature value is likely to alter our knowledge the most. An experimental evaluation on a standard database shows that it is possible to outperform a random subsampling policy in terms of feature selection accuracy.
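The abstract only outlines the approach, so the following is a minimal, hypothetical sketch of the general idea rather than the authors' algorithm. It assumes a margin-based uncertainty criterion to locate the region of the instance space where a new feature value is most likely to alter the model's knowledge, a user-supplied `acquire_value` callable that pays the extraction cost for one instance, a fixed per-feature sampling `budget`, and a decision-tree learner; all of these names and choices are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of active feature-value sampling for choosing between
# candidate features under a sampling budget. Not the authors' implementation.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score


def uncertainty(model, X):
    """Margin-based uncertainty: a small margin between the two most probable
    classes means a new feature value is more likely to change the prediction."""
    proba = model.predict_proba(X)
    top2 = np.sort(proba, axis=1)[:, -2:]
    return 1.0 - (top2[:, 1] - top2[:, 0])


def choose_candidate_feature(X_base, y, candidate_features, acquire_value, budget):
    """Return the candidate feature whose sampled values help accuracy most.

    X_base             : (n, d) array of feature values already available
    y                  : (n,) array of class labels
    candidate_features : iterable of candidate feature identifiers whose
                         values are unknown and costly to extract
    acquire_value      : callable(feature, instance_index) -> float, pays the
                         extraction cost and returns the feature value
    budget             : number of value acquisitions allowed per candidate
                         (should be large enough for the 3-fold evaluation)
    """
    base_model = DecisionTreeClassifier(random_state=0).fit(X_base, y)

    # Sample in the region of instance space where the current model is least
    # certain, i.e. where the new feature value can alter our knowledge the most.
    ranked = np.argsort(-uncertainty(base_model, X_base))[:budget]

    scores = {}
    for feat in candidate_features:
        new_col = np.full(len(X_base), np.nan)
        for i in ranked:
            new_col[i] = acquire_value(feat, i)

        # Evaluate the augmented representation on the sampled instances only
        # (a simplifying assumption of this sketch).
        X_aug = np.column_stack([X_base[ranked], new_col[ranked]])
        scores[feat] = cross_val_score(
            DecisionTreeClassifier(random_state=0), X_aug, y[ranked], cv=3
        ).mean()

    return max(scores, key=scores.get)
```

In this sketch the sampling cost is controlled simply by capping the number of acquisitions per candidate; comparing the resulting cross-validated scores then plays the role of the feature selection decision that the abstract describes.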