Differentiated treatment of missing values in fuzzy clustering

  • Authors:
  • Heiko Timm; Christian Döring; Rudolf Kruse

  • Affiliations:
  • Dept. of Knowledge Processing and Language Engineering, Otto-von-Guericke-University of Magdeburg, Magdeburg, Germany (all three authors)

  • Venue:
  • IFSA'03: Proceedings of the 10th International Fuzzy Systems Association World Congress Conference on Fuzzy Sets and Systems
  • Year:
  • 2003

Abstract

Datasets with partially missing values are a prevailing problem in data analysis. Since several reasons for missing attribute values can be distinguished, we suggest a differentiated treatment of this common problem. For datasets in which feature values are missing completely at random, a variety of approaches have been proposed. In other situations, however, the fact that values are missing provides additional information for the classification of the dataset. Since the known approaches cannot exploit this information, we developed an extension of the Gath and Geva algorithm that can utilize it. We introduce a class-specific probability for missing values in order to assign incomplete data points to clusters appropriately. Benchmark datasets are used to demonstrate the capability of the presented approach.
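
The idea of a cluster-specific probability for missing values can be illustrated with a small sketch. The abstract does not give the update equations of the extended Gath and Geva algorithm, so the code below is not the authors' method. It is a simplified, assumed scoring rule: each cluster has a prior, a per-attribute Gaussian (diagonal variance), and a per-attribute probability that the value is missing. The names `assign_with_missingness`, `p_missing`, and the example parameter values are hypothetical and chosen only for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def assign_with_missingness(point, clusters):
    """Return normalized cluster weights for a possibly incomplete point.

    point: list of floats, with None marking a missing attribute.
    clusters: list of dicts with keys 'prior', 'mean', 'var', 'p_missing'
              (all hypothetical names; one entry per attribute in the lists).
    """
    scores = []
    for c in clusters:
        score = c["prior"]
        for j, value in enumerate(point):
            if value is None:
                # A missing value contributes the cluster's missingness probability.
                score *= c["p_missing"][j]
            else:
                # An observed value contributes "not missing" times its density.
                score *= (1.0 - c["p_missing"][j]) * gaussian_pdf(
                    value, c["mean"][j], c["var"][j]
                )
        scores.append(score)
    total = sum(scores)
    if total == 0.0:
        # Degenerate case: fall back to uniform weights.
        return [1.0 / len(clusters)] * len(clusters)
    return [s / total for s in scores]


# Two clusters that look identical on attribute 0 but differ in how often
# attribute 1 is missing (assumed values, not taken from the paper).
clusters = [
    {"prior": 0.5, "mean": [0.0, 0.0], "var": [1.0, 1.0], "p_missing": [0.0, 0.1]},
    {"prior": 0.5, "mean": [0.0, 5.0], "var": [1.0, 1.0], "p_missing": [0.0, 0.6]},
]
weights = assign_with_missingness([0.0, None], clusters)
print(weights)
```

In this toy setup the observed attribute cannot separate the two clusters, so the weights are driven entirely by the missingness probabilities, and the point leans toward the second cluster, where a missing second attribute is more likely. A method that ignores why values are missing would split such a point evenly.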