Removal and interpolation of missing values using wavelet neural network for heterogeneous data sets

  • Authors:
  • Lipismita Panigrahi;Ruchi Ranjan;Kaberi Das;Debahuti Mishra

  • Affiliations:
  • ITER Siksha O Anusandhan University, Bhubaneswar, Odisha, India;Tata Consultancy Services Bhubaneswar, Odisha, India;ITER, Siksha O Anusandhan University, Bhubaneswar, Odisha, India;ITER Siksha O Anusandhan University, Bhubaneswar, Odisha, India

  • Venue:
  • Proceedings of the International Conference on Advances in Computing, Communications and Informatics
  • Year:
  • 2012

Abstract

Missing data are common and can significantly affect the conclusions that can be drawn from a data set. In statistics, a missing value occurs when no data value is stored for a variable in an observation. Missing values cause several problems, such as loss of the information available for computation and analysis, and they can produce misleading results by introducing bias, that is, a systematic difference between the observed and the unobserved data. This paper presents a methodological framework for developing an automated data imputation model based on a wavelet neural network (WNN). Instead of a conventional activation function in each neuron, the network uses adaptive higher-order functions, namely different wavelet functions, as its kernel. A wavelet is a wave-like oscillation with an amplitude that starts at zero, increases, and then decreases back to zero. Wavelets are generally crafted to have specific properties that make them useful for signal processing. Six real, integer and simulated data sets are subjected to a perturbation experiment based on the random generation of missing values. A neural network (NN) and the WNN are applied to the glass identification, wine recognition, heart disease, leukemia, breast cancer and lung cancer data sets to estimate the missing values, and the results are compared with several classic imputation procedures. Across the performance measures considered, the WNN not only improves the quality of a database with missing values but also clearly gives the best results across different variables.
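
The two ingredients named in the abstract, a wavelet used in place of a neuron activation function and a perturbation experiment that deletes values at random, can be illustrated with a small sketch. This is not the authors' implementation. The choice of the Mexican hat wavelet, the function names (`mexican_hat`, `wavelon`, `perturb`, `mean_impute`, `rmse_on_missing`), the use of column-mean imputation as the "classic" baseline, and RMSE as the performance measure are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math
import random


def mexican_hat(t):
    """Unnormalized Mexican hat wavelet: (1 - t^2) * exp(-t^2 / 2).

    Assumed here as an example mother wavelet; the paper may use others.
    """
    return (1.0 - t * t) * math.exp(-0.5 * t * t)


def wavelon(x, translation, dilation):
    """Output of one hypothetical wavelet hidden unit: psi((x - b) / a)."""
    return mexican_hat((x - translation) / dilation)


def perturb(rows, missing_rate, seed=0):
    """Return a copy of rows with entries replaced by None at random."""
    rng = random.Random(seed)
    return [[None if rng.random() < missing_rate else v for v in row]
            for row in rows]


def mean_impute(rows):
    """Baseline imputation: fill each None with the observed column mean."""
    means = []
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        # Fall back to 0.0 if a column has no observed values at all.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(r)]
            for r in rows]


def rmse_on_missing(original, perturbed, imputed):
    """Root mean squared error over the deleted entries only."""
    errors = [(imputed[i][j] - original[i][j]) ** 2
              for i, row in enumerate(perturbed)
              for j, v in enumerate(row) if v is None]
    return math.sqrt(sum(errors) / len(errors)) if errors else 0.0


if __name__ == "__main__":
    # Toy data, invented for the example (not one of the six data sets).
    data = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
    holes = perturb(data, missing_rate=0.3, seed=1)
    filled = mean_impute(holes)
    print(holes)
    print(rmse_on_missing(data, holes, filled))
```

In an experiment of the kind described, an imputation model such as the WNN would take the place of `mean_impute`, and its error on the deleted entries would be compared against baselines like this one.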