An analysis of missing data treatment methods and their application to health care dataset

Authors:
Peng Liu;Elia El-Darzi;Lei Lei;Christos Vasilakis;Panagiotis Chountas;Wei Huang
Affiliations:
School of Information Management and Engineering, Shanghai University of Finance and Economics, Shanghai, P.R. China;Health Care Computing Group, School of Computer Science, University of Westminster, London, Northwick Park, UK;School of Information Management and Engineering, Shanghai University of Finance and Economics, Shanghai, P.R. China;Health Care Computing Group, School of Computer Science, University of Westminster, London, Northwick Park, UK;Health Care Computing Group, School of Computer Science, University of Westminster, London, Northwick Park, UK;Health Care Computing Group, School of Computer Science, University of Westminster, London, Northwick Park, UK
Venue:
ADMA'05 Proceedings of the First International Conference on Advanced Data Mining and Applications
Year:
2005

Citing 4
Cited 0

Statistical analysis with missing data
C4.5: programs for machine learning
Data mining: concepts and techniques
Principles of data mining

Quantified Score

Hi-index	0.03

Abstract

It is widely accepted that many real-life datasets contain substantial amounts of missing data. In this paper we introduce, analyze, and compare several well-known methods for handling missing data, and we propose new methods based on the Naive Bayesian classifier to estimate and replace missing values. We conduct extensive experiments on datasets from the UCI repository to compare these methods. Finally, we apply them to a geriatric hospital dataset to assess their effectiveness on real-life data.
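
The abstract does not describe how the proposed Naive Bayesian methods work, so the following is only a minimal sketch of the general idea of estimating and replacing a missing categorical value with a Naive Bayes model, not the authors' algorithm. It assumes that rows are tuples of categorical strings, that `None` marks a missing value, and that Laplace smoothing is acceptable; the function name `nb_impute` and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict

MISSING = None  # assumed marker for a missing value


def nb_impute(rows, target):
    """Return a copy of `rows` with missing values in column `target`
    replaced by the most probable value under a categorical Naive Bayes
    model trained on the rows where `target` is observed."""
    train = [r for r in rows if r[target] is not MISSING]
    if not train:
        return list(rows)  # nothing to learn from; leave the data as-is

    predictors = [j for j in range(len(rows[0])) if j != target]
    prior = Counter(r[target] for r in train)
    cond = defaultdict(Counter)  # (column, target value) -> value counts
    domain = defaultdict(set)    # column -> values seen in training rows
    for r in train:
        for j in predictors:
            if r[j] is not MISSING:
                cond[(j, r[target])][r[j]] += 1
                domain[j].add(r[j])

    def log_posterior(row, c):
        # log P(c) + sum over observed predictors of log P(value | c)
        score = math.log(prior[c] / len(train))
        for j in predictors:
            if row[j] is MISSING:
                continue  # a missing predictor simply contributes nothing
            counts = cond[(j, c)]
            k = len(domain[j] | {row[j]})  # number of categories, for smoothing
            score += math.log((counts[row[j]] + 1) / (sum(counts.values()) + k))
        return score

    filled = []
    for r in rows:
        if r[target] is MISSING:
            best = max(sorted(prior), key=lambda c: log_posterior(r, c))
            r = r[:target] + (best,) + r[target + 1:]
        filled.append(r)
    return filled


# Toy data (hypothetical): the last column is the attribute to impute.
rows = [
    ("sunny", "hot", "no"),
    ("sunny", "mild", "no"),
    ("rainy", "mild", "yes"),
    ("rainy", "cool", "yes"),
    ("sunny", "hot", MISSING),
    ("rainy", "cool", MISSING),
]
filled = nb_impute(rows, target=2)
```

The sketch treats the attribute being imputed as the class label and all other attributes as predictors, which is one common way to use a classifier for imputation. Complete rows are returned unchanged and the input list is not modified.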