A scatter method for data and variable importance evaluation

  • Authors:
  • Martti Juhola;v. Siermala

  • Affiliations:
  • Computer Science, School of Information Sciences, 33014 University of Tampere, Finland;Computer Science, School of Information Sciences, 33014 University of Tampere, Finland

  • Venue:
  • Integrated Computer-Aided Engineering
  • Year:
  • 2012

Abstract

We designed an algorithm for examining the importance of variables in data sets, for the purposes of variable evaluation and weighting. In particular, it is designed to evaluate whether a data set contains information that is useful for separating classes in classification and prediction. Such an evaluation can be performed for an entire data set or separately for individual classes or variables. The scatter method is based on traversing a data set from one case to its near neighbour and counting class changes, i.e., the points at which the class of the next near case differs from that of the current one. The fewer the changes, the more compact the classes are in the variable space, and the more likely it is that they can be separated with high classification accuracy. We tested the method with different data sets of medical origin. The results showed that the scatter method can be used to explore how separable the classes in these data sets were. This is useful for variable evaluation and weighting.
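
The traversal described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes Euclidean distance, a greedy walk that always moves to the nearest not-yet-visited case, a fixed starting case, and ties broken by the lowest index. The function name `count_class_changes` and its parameters are hypothetical, and any normalisation of the raw count used in the paper is not reproduced here.

```python
import math


def count_class_changes(points, labels, start=0):
    """Walk the data greedily from case to nearest unvisited case and
    count how often the class label changes along the walk.

    Assumptions (not taken from the paper): Euclidean distance,
    a fixed starting case, ties broken by the lowest index.
    """
    n = len(points)
    if n == 0:
        return 0

    unvisited = set(range(n))
    unvisited.remove(start)
    current = start
    changes = 0

    while unvisited:
        # Nearest unvisited case to the current one (lowest index wins ties).
        nxt = min(
            unvisited,
            key=lambda j: (math.dist(points[current], points[j]), j),
        )
        if labels[nxt] != labels[current]:
            changes += 1
        unvisited.remove(nxt)
        current = nxt

    return changes


# Two compact, well-separated classes: the walk crosses between them once.
separated_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
separated_labels = ["a", "a", "a", "b", "b", "b"]

# Two classes interleaved along a single variable: every step changes class.
interleaved_points = [(0,), (1,), (2,), (3,)]
interleaved_labels = ["a", "b", "a", "b"]

separated_changes = count_class_changes(separated_points, separated_labels)
interleaved_changes = count_class_changes(interleaved_points, interleaved_labels)
```

In this sketch the separated data yields few class changes and the interleaved data yields many, which matches the abstract's reading that fewer changes indicate more compact classes. Passing only one column of the data as `points` would give a per-variable count under the same assumptions.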