Searching for interacting features in subset selection

  • Authors:
  • Zheng Zhao;Huan Liu

  • Affiliations:
  • (Correspd. E-mail: zhaozheng@asu.edu) Department of Computer Science & Engineering, Arizona State University, AZ, USA;Department of Computer Science & Engineering, Arizona State University, AZ, USA

  • Venue:
  • Intelligent Data Analysis
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

The evolving and adapting capabilities of robust intelligence are best manifested in its ability to learn. Machine learning enables computer systems to learn, and improve performance. Feature selection facilitates machine learning (e.g., classification) by aiming to remove irrelevant features. Feature (attribute) interaction presents a challenge to feature subset selection for classification. This is because a feature by itself might have little correlation with the target concept, but when it is combined with some other features, they can be strongly correlated with the target concept. Thus, the unintentional removal of these features may result in poor classification performance. It is computationally intractable to handle feature interactions in general. However, the presence of feature interaction in a wide range of real-world applications demands practical solutions that can reduce high-dimensional data while preserving feature interactions. In this paper, we take up the challenge to design a special data structure for feature quality evaluation, and to employ an information-theoretic feature ranking mechanism to efficiently handle feature interaction in subset selection. We conduct experiments to evaluate our approach by comparing with some representative methods, perform a lesion study to examine the critical components of the proposed algorithm to gain insights, and investigate related issues such as data structure, ranking, time complexity, and scalability in search of interacting features.