An analysis of missing data treatment methods and their application to health care dataset

  • Authors:
  • Peng Liu;Elia El-Darzi;Lei Lei;Christos Vasilakis;Panagiotis Chountas;Wei Huang

  • Affiliations:
  • School of Information Management and Engineering, Shanghai University of Finance and Economics, Shanghai, P.R. China;Health Care Computing Group, School of Computer Science, University of Westminster, London, Northwick Park, UK;School of Information Management and Engineering, Shanghai University of Finance and Economics, Shanghai, P.R. China;Health Care Computing Group, School of Computer Science, University of Westminster, London, Northwick Park, UK;Health Care Computing Group, School of Computer Science, University of Westminster, London, Northwick Park, UK;Health Care Computing Group, School of Computer Science, University of Westminster, London, Northwick Park, UK

  • Venue:
  • ADMA'05 Proceedings of the First International Conference on Advanced Data Mining and Applications
  • Year:
  • 2005

Abstract

It is well accepted that many real-life datasets are full of missing data. In this paper we introduce, analyze, and compare several well-known methods for handling missing data, and propose new methods based on the Naive Bayesian classifier to estimate and replace missing values. We conduct extensive experiments on datasets from the UCI repository to compare these methods. Finally, we apply these models to a geriatric hospital dataset to assess their effectiveness on a real-life dataset.
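
The abstract does not describe how the Naive Bayesian estimation is carried out, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes categorical attributes, uses `None` as a hypothetical missing-value marker, trains on the records where the target attribute is observed, and fills each missing value with the most probable class under Laplace smoothing. The function name `naive_bayes_impute`, the `alpha` parameter, and the column names in the example are illustrative assumptions.

```python
import math
from collections import Counter, defaultdict

MISSING = None  # assumption: missing values are encoded as None


def naive_bayes_impute(rows, target, alpha=1.0):
    """Return copies of `rows` with missing `target` values filled in.

    `rows` is a list of dicts holding categorical values. A naive Bayes
    model is fitted on the rows where `target` is observed; `alpha` is
    the Laplace smoothing constant.
    """
    train = [r for r in rows if r[target] is not MISSING]
    if not train:
        return [dict(r) for r in rows]  # nothing to learn from

    predictors = [c for c in rows[0] if c != target]
    class_counts = Counter(r[target] for r in train)

    # cond[col][cls][value] = number of training rows with that combination
    cond = {c: defaultdict(Counter) for c in predictors}
    levels = {c: set() for c in predictors}
    for r in train:
        for c in predictors:
            if r[c] is not MISSING:
                cond[c][r[target]][r[c]] += 1
                levels[c].add(r[c])

    def predict(row):
        best_cls, best_score = None, -math.inf
        for cls, n_cls in class_counts.items():
            score = math.log(n_cls / len(train))  # log prior
            for c in predictors:
                v = row[c]
                if v is MISSING:
                    continue  # skip predictors that are themselves missing
                counts = cond[c][cls]
                k = len(levels[c] | {v})  # number of distinct values seen
                score += math.log(
                    (counts[v] + alpha) / (sum(counts.values()) + alpha * k)
                )
            if score > best_score:
                best_cls, best_score = cls, score
        return best_cls

    return [
        {**r, target: predict(r)} if r[target] is MISSING else dict(r)
        for r in rows
    ]


# Hypothetical toy records; the column names are not taken from the paper.
records = [
    {"age_group": "old", "mobility": "low", "ward": "geriatric"},
    {"age_group": "old", "mobility": "low", "ward": "geriatric"},
    {"age_group": "young", "mobility": "high", "ward": "general"},
    {"age_group": "young", "mobility": "high", "ward": "general"},
    {"age_group": "old", "mobility": "low", "ward": MISSING},
]
filled = naive_bayes_impute(records, "ward")
print(filled[-1]["ward"])
```

The sketch returns new records rather than modifying the input, which keeps the original data available for comparing this approach against simpler treatments such as deleting incomplete cases or substituting the most frequent value.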