Context-Sensitive Feature Selection for Lazy Learners

  • Authors:
  • Pedro Domingos

  • Affiliations:
  • Department of Information and Computer Science, University of California, Irvine, Irvine, California 92697, U.S.A. E-mail: pedrod@ics.uci.edu

  • Venue:
  • Artificial Intelligence Review - Special issue on lazy learning
  • Year:
  • 1997


Abstract

High sensitivity to irrelevant features is arguably the main shortcoming of simple lazy learners. In response to it, many feature selection methods have been proposed, including forward sequential selection (FSS) and backward sequential selection (BSS). Although they often produce substantial improvements in accuracy, these methods select the same set of relevant features everywhere in the instance space, and thus represent only a partial solution to the problem. In general, some features will be relevant only in some parts of the space; deleting them may hurt accuracy in those parts, but selecting them will have the same effect in parts where they are irrelevant. This article introduces RC, a new feature selection algorithm that uses a clustering-like approach to select sets of locally relevant features (i.e., the features it selects may vary from one instance to another). Experiments in a large number of domains from the UCI repository show that RC almost always improves accuracy with respect to FSS and BSS, often with high significance. A study using artificial domains confirms the hypothesis that this difference in performance is due to RC's context sensitivity, and also suggests conditions where this sensitivity will and will not be an advantage. Another feature of RC is that it is faster than FSS and BSS, often by an order of magnitude or more.
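
For readers unfamiliar with the baselines named in the abstract, the sketch below illustrates forward sequential selection (FSS) wrapped around a 1-nearest-neighbour lazy learner, scored by leave-one-out accuracy. It is a minimal, illustrative implementation: the function names, the 1-NN learner, and the leave-one-out scoring criterion are assumptions made for this example, not the exact procedure evaluated in the paper. Note that the subset it returns is a single global one, which is precisely the limitation the abstract says RC's locally relevant feature sets are meant to address.

```python
# Minimal sketch of forward sequential selection (FSS) around a 1-NN
# lazy learner. Names and scoring choices are illustrative assumptions,
# not taken from the paper.
import numpy as np

def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of 1-NN restricted to the given feature subset."""
    if not features:
        return 0.0
    Xs = X[:, features]
    correct = 0
    for i in range(len(y)):
        # Euclidean distance from instance i to every other instance.
        d = np.linalg.norm(Xs - Xs[i], axis=1)
        d[i] = np.inf                       # exclude the instance itself
        correct += int(y[int(np.argmin(d))] == y[i])
    return correct / len(y)

def forward_sequential_selection(X, y):
    """Greedily add the single feature that most improves LOO accuracy."""
    selected, best_acc = [], 0.0
    remaining = set(range(X.shape[1]))
    while remaining:
        scores = {f: loo_accuracy(X, y, selected + [f]) for f in remaining}
        f_best = max(scores, key=scores.get)
        if scores[f_best] <= best_acc:      # stop when no remaining feature helps
            break
        selected.append(f_best)
        best_acc = scores[f_best]
        remaining.remove(f_best)
    return selected
```

Backward sequential selection (BSS) proceeds analogously, starting from the full feature set and greedily deleting features; in both cases the chosen subset applies to every query instance, whereas RC allows the subset to vary across the instance space.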