Mining rules from an incomplete dataset with a high missing rate

  • Authors:
  • Tzung-Pei Hong; Chih-Wei Wu

  • Affiliations:
  • Department of Computer Science and Information Engineering, National University of Kaohsiung, Kaohsiung 811, Taiwan and Department of Computer Science and Engineering, National Sun Yat-sen Univers ...; Department of Electrical Engineering, National University of Kaohsiung, Kaohsiung 811, Taiwan

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2011

Quantified Score

Hi-index 12.05

Abstract

Recovering missing values in a dataset has become an important research issue in data mining and machine learning. In this paper, we introduce an iterative missing-value completion method that uses RAR (Robust Association Rules) support values to extract association rules useful for inferring missing values. The method consists of three phases. The first phase uses association rules to roughly complete the missing values. The second phase iteratively lowers the minimum support to gather more association rules and complete the remaining missing values. The third phase uses association rules mined from the completed dataset to correct the values that were filled in. Experimental results show that the proposed approach achieves good accuracy and data recovery even when the missing-value rate is high.
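
The three phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the RAR support measure, so the sketch assumes a simple stand-in (support and confidence counted only over records in which both attributes are known), restricts itself to single-antecedent rules, and mines the Phase 1 and Phase 2 rules from the original incomplete data. The names `mine_rules`, `predict`, `complete`, and the `supports` threshold schedule are hypothetical.

```python
from collections import Counter
from itertools import permutations

MISSING = None  # assumption: missing cells are encoded as None


def mine_rules(rows, min_support, min_conf):
    """Mine rules (attr a = va) -> (attr b = vb).

    Assumption: support and confidence are counted only over rows where
    both attributes are known (a stand-in for the RAR support measure).
    Returns {(a, va, b): (vb, confidence)}.
    """
    rules = {}
    for a, b in permutations(range(len(rows[0])), 2):
        known = [(r[a], r[b]) for r in rows
                 if r[a] is not MISSING and r[b] is not MISSING]
        if not known:
            continue
        pair_counts = Counter(known)
        ante_counts = Counter(va for va, _ in known)
        for (va, vb), count in pair_counts.items():
            support = count / len(known)
            conf = count / ante_counts[va]
            if support >= min_support and conf >= min_conf:
                key = (a, va, b)
                if key not in rules or conf > rules[key][1]:
                    rules[key] = (vb, conf)
    return rules


def predict(row, target, rules):
    """Return the value proposed by the most confident matching rule, or MISSING."""
    best = None
    for a, va in enumerate(row):
        if a == target or va is MISSING:
            continue
        hit = rules.get((a, va, target))
        if hit is not None and (best is None or hit[1] > best[1]):
            best = hit
    return MISSING if best is None else best[0]


def complete(rows, supports=(0.5, 0.3, 0.1), min_conf=0.6):
    """Three-phase completion; `supports` is a hypothetical decreasing schedule."""
    data = [list(r) for r in rows]
    holes = [(i, j) for i, r in enumerate(data)
             for j, v in enumerate(r) if v is MISSING]

    # Phase 1 (first threshold) and Phase 2 (lower thresholds):
    # fill what the current rules cover, then relax the minimum support.
    remaining = list(holes)
    for min_support in supports:
        if not remaining:
            break
        rules = mine_rules(rows, min_support, min_conf)
        still_missing = []
        for i, j in remaining:
            guess = predict(rows[i], j, rules)
            if guess is MISSING:
                still_missing.append((i, j))
            else:
                data[i][j] = guess
        remaining = still_missing

    # Phase 3: re-mine rules from the completed data and revisit every
    # cell that was originally missing, overwriting it when a rule applies.
    rules = mine_rules(data, supports[-1], min_conf)
    for i, j in holes:
        guess = predict(data[i], j, rules)
        if guess is not MISSING:
            data[i][j] = guess
    return data
```

Calling `complete(rows)` on a list of equal-length tuples returns a new list of lists and leaves `rows` unchanged; cells that no rule covers even at the lowest threshold stay `None`. Passing the thresholds as an explicit sequence, rather than repeatedly subtracting a step, avoids floating-point drift in the stopping condition.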