Imputing missing values for mixed numeric and categorical attributes based on incomplete data hierarchical clustering

  • Authors:
  • Xiaodong Feng; Sen Wu; Yanchi Liu

  • Affiliations:
  • School of Economics and Management, University of Science and Technology Beijing, Beijing, P.R. China; School of Economics and Management, University of Science and Technology Beijing, Beijing, P.R. China; School of Economics and Management, University of Science and Technology Beijing, Beijing, P.R. China

  • Venue:
  • KSEM'11 Proceedings of the 5th international conference on Knowledge Science, Engineering and Management
  • Year:
  • 2011

Abstract

Missing data imputation is a key issue in data pre-processing for data mining. Although many methods for missing value imputation exist, nearly all of them have limitations and are designed for either numeric attributes or categorical attributes, not both. This paper presents IMIC, a new missing value Imputation method for Mixed numeric and categorical attributes based on Incomplete data hierarchical clustering, following the introduction of a new concept, the Incomplete Set Mixed Feature Vector (ISMFV). The effectiveness of the new method is evaluated through a comparison experiment using three real data sets from UCI.
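
To make the general idea of cluster-based imputation for mixed attributes concrete, here is a minimal Python sketch. It is not the IMIC algorithm: the abstract does not define ISMFV or the clustering procedure, so the sketch assumes cluster labels are already available and simply fills a missing numeric cell with the cluster mean and a missing categorical cell with the cluster mode. The function name `impute_by_cluster`, its parameters, and the use of `None` for a missing value are all hypothetical choices made for illustration.

```python
from collections import Counter
from statistics import mean

MISSING = None  # assumption: missing cells are encoded as None


def impute_by_cluster(rows, labels, numeric_cols):
    """Fill each missing cell from the observed values in the same cluster.

    rows         -- list of lists; cells may be MISSING
    labels       -- one cluster label per row (assumed given, not computed here)
    numeric_cols -- set of column indices that hold numeric attributes
    """
    n_cols = len(rows[0])
    filled = [list(r) for r in rows]  # copy so the input is not modified
    for label in set(labels):
        members = [i for i, lab in enumerate(labels) if lab == label]
        for col in range(n_cols):
            observed = [rows[i][col] for i in members
                        if rows[i][col] is not MISSING]
            if not observed:
                continue  # no observed value in this cluster; leave the gap
            if col in numeric_cols:
                value = mean(observed)  # numeric: cluster mean
            else:
                value = Counter(observed).most_common(1)[0][0]  # categorical: mode
            for i in members:
                if filled[i][col] is MISSING:
                    filled[i][col] = value
    return filled


# Toy data: column 0 is numeric, column 1 is categorical.
rows = [
    [1.0, "red"],
    [3.0, None],
    [None, "red"],
    [10.0, "blue"],
    [None, "blue"],
]
labels = [0, 0, 0, 1, 1]  # hypothetical cluster assignment
filled = impute_by_cluster(rows, labels, numeric_cols={0})
```

In this toy run, the missing color in the first cluster becomes "red" and the missing numeric values become the mean of their own cluster. A method such as the one the abstract describes would additionally have to build the clusters from incomplete records, which this sketch leaves out.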