Missing Values: Proposition of a Typology and Characterization with an Association Rule-Based Model

Authors:
Leila Ben Othman;François Rioult;Sadok Ben Yahia;Bruno Crémilleux
Affiliations:
Department of Computer Science, Faculty of Sciences of Tunis, Tunisia and GREYC - CNRS UMR 6072, University of Caen Basse-Normandie, France;GREYC - CNRS UMR 6072, University of Caen Basse-Normandie, France;Department of Computer Science, Faculty of Sciences of Tunis, Tunisia;GREYC - CNRS UMR 6072, University of Caen Basse-Normandie, France
Venue:
DaWaK '09 Proceedings of the 11th International Conference on Data Warehousing and Knowledge Discovery
Year:
2009

Abstract

Handling missing values in real-world datasets is a major challenge that has attracted the interest of many scientific communities. Many works propose completion methods or new data mining techniques that tolerate missing values, but both tasks turn out to be very hard. In this paper, we propose a new typology that characterizes missing values according to relationships within the data. These relationships are discovered automatically by data mining techniques using generic bases of association rules, and we define four types of missing values from them. The characterization is made for each missing value individually. It thus differs from the well-known statistical methods, which apply the same treatment to all missing values of a given attribute. We claim that such a local characterization enables perceptive techniques that deal with missing values according to their origins: the way a missing value is handled should depend on its origin (e.g., an attribute that is meaningless w.r.t. other attributes, a missing value that depends on other data, or a value missing by accident). Experiments on a real-world medical dataset highlight the value of such a characterization.