An experimental comparison of three rough set approaches to missing attribute values

  • Authors:
  • Jerzy W. Grzymala-Busse; Witold J. Grzymala-Busse

  • Affiliations:
  • Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS and Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland; Touchnet Information Systems, Inc., Lenexa, KS

  • Venue:
  • Transactions on Rough Sets VI
  • Year:
  • 2007

Abstract

In this paper we present the results of experiments conducted to compare three types of missing attribute values: lost values, "do not care" conditions, and attribute-concept values. For our experiments we selected six well-known data sets. For every data set we created 30 new data sets by replacing specified values with each of the three types of missing attribute values, starting at 10% and ending at 100% in increments of 10%. For every concept of every data set, concept lower and upper approximations were computed. Error rates were evaluated using ten-fold cross-validation. Overall, interpreting missing attribute values as lost provides the best results for most incomplete data sets.
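
The construction of the 30 incomplete data sets per original data set (three types of missing values at ten replacement levels) can be illustrated with a minimal Python sketch. It is not the authors' procedure. The function names, the symbols `?`, `*`, and `-` for the three types, uniform random selection of cells, and the choice to leave the decision value (last column) untouched are all assumptions made for illustration.

```python
import random

# Assumed symbols for the three types of missing attribute values.
MISSING_SYMBOLS = {
    "lost": "?",
    "do_not_care": "*",
    "attribute_concept": "-",
}


def make_incomplete(rows, fraction, symbol, seed=0):
    """Return a copy of `rows` with `fraction` of attribute cells set to `symbol`.

    Each row is a list whose last element is the decision value; the
    decision column is never replaced (an assumption of this sketch).
    """
    rng = random.Random(seed)
    # Every (row, column) position that holds an attribute value.
    cells = [(i, j) for i, row in enumerate(rows) for j in range(len(row) - 1)]
    count = round(fraction * len(cells))
    result = [list(row) for row in rows]
    for i, j in rng.sample(cells, count):
        result[i][j] = symbol
    return result


def make_all_variants(rows, seed=0):
    """Build 3 types x 10 levels (10%, 20%, ..., 100%) = 30 incomplete data sets."""
    variants = {}
    for kind, symbol in MISSING_SYMBOLS.items():
        for percent in range(10, 101, 10):
            variants[(kind, percent)] = make_incomplete(
                rows, percent / 100, symbol, seed
            )
    return variants
```

Calling `make_all_variants(rows)` on a complete data set returns a dictionary keyed by `(type, percent)`, for example `("lost", 30)`. With a fixed seed the same cells are replaced for each type at a given level, so the three interpretations are compared on identical patterns of missingness; whether the original experiments were set up this way is not stated in the abstract.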