Bias of importance measures for multi-valued attributes and solutions

  • Authors:
  • Houtao Deng; George Runger; Eugene Tuv

  • Affiliations:
  • Houtao Deng and George Runger: School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ; Eugene Tuv: Intel Corporation, Chandler, AZ

  • Venue:
  • ICANN'11: Proceedings of the 21st International Conference on Artificial Neural Networks, Part II
  • Year:
  • 2011

Abstract

Attribute importance measures for supervised learning are important for improving both learning accuracy and interpretability. However, it is well known that such measures can be biased when the predictor attributes have different numbers of values. We propose two methods to address this bias: OOBForest, which uses an out-of-bag sampling method, and pForest, which is based on the new concept of a partial permutation test. Existing research has considered the bias problem only among irrelevant attributes or among equally informative attributes, while we compare against existing methods in a setting where unequally informative attributes (with or without interactions) and irrelevant attributes co-exist. We observe that existing methods are not always reliable for multi-valued predictors, while the proposed methods compare favorably in our experiments.
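The cardinality bias the abstract refers to can be illustrated with a minimal sketch (this is an illustration of the bias itself, not the paper's OOBForest or pForest methods): the empirical Gini gain of a split tends to favor an irrelevant many-valued attribute over an irrelevant binary one, simply because more distinct values give more ways to overfit the sample. All names below are for illustration only.

```python
# Illustration of cardinality bias in impurity-based importance:
# both attributes are pure noise, yet the 32-valued one shows a
# larger apparent Gini gain than the 2-valued one.
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_gain(attr_values, labels):
    """Impurity reduction from splitting on every distinct attribute value."""
    groups = {}
    for v, y in zip(attr_values, labels):
        groups.setdefault(v, []).append(y)
    weighted = sum(len(g) / len(labels) * gini(g) for g in groups.values())
    return gini(labels) - weighted

random.seed(0)
n = 200
labels = [random.randint(0, 1) for _ in range(n)]   # random binary target
binary = [random.randint(0, 1) for _ in range(n)]   # irrelevant, 2 values
multi = [random.randint(0, 31) for _ in range(n)]   # irrelevant, 32 values

print("binary gain:", gini_gain(binary, labels))
print("multi  gain:", gini_gain(multi, labels))
```

Both attributes are independent of the target, so an unbiased importance measure should score them near zero and roughly equally; the many-valued attribute nonetheless receives a visibly larger gain, which is the bias that out-of-bag evaluation and permutation-based corrections aim to remove.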