Feature Selection for Building Cost-Effective Data Stream Classifiers

  • Authors:
  • Like Gao; X. Sean Wang

  • Affiliations:
  • University of Vermont; University of Vermont

  • Venue:
  • ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
  • Year:
  • 2005

Abstract

A stream classifier is a decision model that assigns a class label to a data stream based on the data that has arrived so far. The classifier can use various features of the stream, each of which may differ in its relevance to the classification task and in the cost of obtaining its value. As time passes, some less costly features may become more relevant, but the time spent waiting for a decision can itself be considered a cost. The challenge is to balance these costs when building a cost-effective classifier. This paper proposes a new feature selection strategy that extends the traditional Relief algorithm in two ways: (1) it estimates the classification cost associated with each feature, and (2) it orders all features by a score that combines the cost estimate with classification relevance. A classifier is then built from the selected features using a traditional classification method. Experimental results show that classifiers constructed with this strategy are indeed cost-effective.
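
The abstract does not state how cost and relevance are combined, so the following is only a minimal sketch of the general idea, not the authors' method. It computes basic Relief relevance weights (two classes, numeric features, one nearest hit and one nearest miss per instance) and then ranks features with a hypothetical score, relevance minus a weighted normalised cost. The function names, the `trade_off` parameter, and the linear scoring rule are all assumptions made for illustration, and the time-related cost mentioned above is not modelled.

```python
import random


def relief_weights(X, y, n_samples=None, seed=0):
    """Basic Relief relevance weights for numeric features and two classes.

    If n_samples is None, every instance is visited once (deterministic);
    otherwise n_samples instances are drawn at random with replacement.
    """
    n, d = len(X), len(X[0])

    # Per-feature value ranges normalise differences into [0, 1].
    spans = []
    for f in range(d):
        col = [row[f] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(f, a, b):
        return abs(a[f] - b[f]) / spans[f]

    def dist(a, b):
        return sum(diff(f, a, b) for f in range(d))

    if n_samples is None:
        picks = list(range(n))
    else:
        rng = random.Random(seed)
        picks = [rng.randrange(n) for _ in range(n_samples)]

    m = len(picks)
    weights = [0.0] * d
    for i in picks:
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        if not hits or not misses:
            continue
        hit = min(hits, key=lambda j: dist(X[i], X[j]))
        miss = min(misses, key=lambda j: dist(X[i], X[j]))
        # Reward features that separate the nearest miss,
        # penalise features that separate the nearest hit.
        for f in range(d):
            weights[f] += (diff(f, X[i], X[miss]) - diff(f, X[i], X[hit])) / m
    return weights


def rank_by_cost_and_relevance(weights, costs, trade_off=0.5):
    """Order feature indices by a hypothetical combined score (an assumption):
    relevance - trade_off * (cost / max cost). Best feature comes first."""
    max_cost = max(costs) or 1.0
    scores = [w - trade_off * (c / max_cost) for w, c in zip(weights, costs)]
    return sorted(range(len(weights)), key=lambda f: scores[f], reverse=True)


# Toy data: feature 0 separates the classes, feature 1 does not.
X = [[0.0, 0.3], [0.1, 0.9], [0.9, 0.2], [1.0, 0.8]]
y = [0, 0, 1, 1]
costs = [10.0, 1.0]  # feature 0 is assumed to be ten times more expensive

weights = relief_weights(X, y)
cheap_penalty = rank_by_cost_and_relevance(weights, costs, trade_off=0.5)
heavy_penalty = rank_by_cost_and_relevance(weights, costs, trade_off=2.0)
```

In this toy setting the relevant but expensive feature ranks first while the cost penalty is mild and drops behind the cheap feature once `trade_off` is large, which is the kind of balance between relevance and cost that the abstract describes.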