Yet another approach for completing missing values

  • Authors:
  • Leila Ben Othman;Sadok Ben Yahia

  • Affiliations:
  • Faculty of Sciences of Tunis, Computer Science Department, Campus University, Tunis, Tunisia;Faculty of Sciences of Tunis, Computer Science Department, Campus University, Tunis, Tunisia

  • Venue:
  • CLA'06 Proceedings of the 4th international conference on Concept lattices and their applications
  • Year:
  • 2006

Abstract

Real-life datasets commonly contain scattered missing values. Regarded as "dirty data", such values are usually removed during the pre-processing step of the KDD process. Starting from the premise that "making up this missing data is better than throwing it away", we present a new approach for completing missing data. The main distinguishing feature of the approach is that it highlights a fruitful synergy between generic bases of association rules and the handling of missing values. Beyond their interesting compactness rate, generic association rules considerably reduce the number of conflicts during the completion step. A new metric called "Robustness" is also introduced; it aims to select the robust association rule for completing a missing value whenever a conflict appears. Experiments carried out on benchmark datasets confirm the soundness of our approach: it reduces conflicts during the completion step while offering a high percentage of correct completions.
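
As a rough illustration of the completion step described in the abstract, the sketch below fills a missing value using association rules whose premises match the known values of a row, and counts a conflict whenever the matching rules disagree on the value to fill in. It is a minimal sketch under stated assumptions, not the authors' algorithm: the `Rule` structure, the function names, the toy data, and the `score` field are all hypothetical. In particular, the abstract does not define the "Robustness" metric, so `score` is only a placeholder for whatever ranking value is used to break conflicts.

```python
from collections import namedtuple

# Hypothetical rule representation (not from the paper):
# premise   - dict mapping attribute -> required value
# attribute - the attribute the rule concludes on
# value     - the value the rule predicts for that attribute
# score     - placeholder ranking value; the paper's "Robustness" metric
#             is not defined in the abstract, so this is only a stand-in
Rule = namedtuple("Rule", ["premise", "attribute", "value", "score"])

MISSING = None  # marker for a missing value in a row


def matching_rules(row, rules, attribute):
    """Return rules that conclude on `attribute` and whose premise
    agrees with the known (non-missing) values of `row`."""
    return [
        r for r in rules
        if r.attribute == attribute
        and all(row.get(a) == v for a, v in r.premise.items())
    ]


def complete_row(row, rules):
    """Fill missing values of `row`; return (completed_row, conflict_count).

    A conflict is counted when the matching rules propose more than one
    distinct value. The highest-scoring rule then decides the value.
    """
    completed = dict(row)
    conflicts = 0
    for attribute, value in row.items():
        if value is not MISSING:
            continue
        candidates = matching_rules(row, rules, attribute)
        if not candidates:
            continue  # no applicable rule: leave the value missing
        if len({r.value for r in candidates}) > 1:
            conflicts += 1
        best = max(candidates, key=lambda r: r.score)
        completed[attribute] = best.value
    return completed, conflicts


# Toy data, invented for illustration only.
rules = [
    Rule({"outlook": "sunny"}, "play", "no", 0.8),
    Rule({"windy": "false"}, "play", "yes", 0.6),
    Rule({"outlook": "rainy"}, "humidity", "high", 0.9),
]
row = {"outlook": "sunny", "windy": "false", "humidity": MISSING, "play": MISSING}
completed, conflicts = complete_row(row, rules)
```

In this toy run, two rules propose different values for `play`, so one conflict is counted and the higher-scoring rule decides; no rule applies to `humidity`, so it stays missing. A more compact rule set gives fewer candidate rules per missing value, which is one way to read the abstract's claim that generic bases reduce conflicts.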