A Comparison of Three Approximation Strategies for Incomplete Data Sets

  • Authors:
  • Jerzy W. Grzymala-Busse;Witold J. Grzymala-Busse;Zdzislaw S. Hippe;Wojciech Rzasa

  • Venue:
  • GRC '07 Proceedings of the 2007 IEEE International Conference on Granular Computing
  • Year:
  • 2007

Abstract

In this paper we consider incomplete data sets, i.e., data sets with missing attribute values. Two types of missing attribute values are studied: lost and "do not care". Furthermore, three definitions of approximations are discussed: singleton, subset, and concept. Theoretically, singleton approximations should not be used in data mining, since concepts approximated by singleton approximations are not definable. However, we conducted a number of experiments on 44 different incomplete data sets using all three approximation definitions, and our results show that none of these approximations is superior to the others.
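
The abstract names the three approximation definitions without stating them. The following is a minimal sketch of how they are commonly formulated in the rough-set literature via characteristic sets. It assumes that a lost value (written "?" here) matches nothing, that a "do not care" value (written "*") matches every value of its attribute, and that a case with a missing value on an attribute is not constrained by that attribute. The toy table, the attribute names, and the function names are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set

LOST = "?"        # assumed marker for a lost value
DONT_CARE = "*"   # assumed marker for a "do not care" value


def characteristic_sets(table: List[Dict[str, str]]) -> List[Set[int]]:
    """Return K(x) for every case x: the cases indistinguishable from x
    on all attributes where x has a specified value."""
    n = len(table)
    result = []
    for x in range(n):
        k = set(range(n))
        for attr, value in table[x].items():
            if value in (LOST, DONT_CARE):
                continue  # missing value in x: this attribute does not restrict K(x)
            # Block of (attr, value): cases with that value, plus "do not care" cases.
            # Cases whose value is lost belong to no block of this attribute.
            block = {y for y in range(n) if table[y][attr] in (value, DONT_CARE)}
            k &= block
        result.append(k)
    return result


def approximations(K: List[Set[int]], concept: Set[int]) -> Dict[str, Set[int]]:
    """Compute singleton, subset, and concept lower/upper approximations."""
    universe = range(len(K))
    inside = [x for x in universe if K[x] <= concept]    # K(x) is a subset of the concept
    touching = [x for x in universe if K[x] & concept]   # K(x) intersects the concept
    return {
        # Singleton: collect the cases themselves.
        "singleton_lower": set(inside),
        "singleton_upper": set(touching),
        # Subset: collect the characteristic sets of all qualifying cases.
        "subset_lower": set().union(*(K[x] for x in inside)),
        "subset_upper": set().union(*(K[x] for x in touching)),
        # Concept: as subset, but only cases belonging to the concept contribute.
        "concept_lower": set().union(*(K[x] for x in inside if x in concept)),
        "concept_upper": set().union(*(K[x] for x in touching if x in concept)),
    }


# Hypothetical four-case table with one lost and one "do not care" value.
table = [
    {"Temperature": "high",    "Headache": "yes"},
    {"Temperature": LOST,      "Headache": "yes"},
    {"Temperature": DONT_CARE, "Headache": "no"},
    {"Temperature": "normal",  "Headache": "no"},
]
K = characteristic_sets(table)
approx = approximations(K, concept={0, 2})
```

On this toy table the three lower approximations coincide, while the concept upper approximation is smaller than the singleton and subset upper approximations. The subset and concept approximations are unions of characteristic sets by construction, whereas a singleton approximation is a set of individual cases and need not be such a union. That is one way to read the abstract's remark about definability.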