Statistical analysis with missing data
Probabilistic reasoning in intelligent systems: networks of plausible inference
"Missing Is Useful": Missing Values in Cost-Sensitive Decision Trees
IEEE Transactions on Knowledge and Data Engineering
Bayesian networks for imputation in classification problems
Journal of Intelligent Information Systems
Missing Attribute Value Prediction Based on Artificial Neural Network and Rough Set Theory
BMEI '08 Proceedings of the 2008 International Conference on BioMedical Engineering and Informatics - Volume 01
A Bayesian Approach for Estimating and Replacing Missing Categorical Data
Journal of Data and Information Quality (JDIQ)
Estimation of Missing Values Using a Weighted K-Nearest Neighbors Algorithm
ESIAT '09 Proceedings of the 2009 International Conference on Environmental Science and Information Application Technology - Volume 03
Shell-neighbor method and its application in missing data imputation
Applied Intelligence
Missing values estimation in microarray data with partial least squares regression
ICCS'06 Proceedings of the 6th international conference on Computational Science - Volume Part II
Because incompleteness limits how data can be used, missing values in a database should be estimated to make data mining and analysis more accurate. Besides ignoring missing values or setting them to defaults, many imputation methods have been proposed, but all of them have limitations. This paper proposes a probabilistic method for estimating missing values. We construct a Bayesian network in a novel way to identify the dependencies in a dataset, then use Bayesian reasoning to find the most probable substitution for each missing value. The benefits of this method are that (1) irrelevant attributes can be ignored during estimation; (2) the network is built with no target attribute, so all attributes are handled in one model; and (3) probability information is available to measure the accuracy of the imputation. Experimental results show that our construction algorithm is effective and that the quality of the filled values exceeds that of the mode imputation method and the kNN method. We also verify experimentally the effectiveness of the probabilities given by our method.
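The reasoning step described in the abstract can be illustrated with a minimal sketch. It is not the paper's construction algorithm: the network structure (`parents`), the toy attributes, the Laplace smoothing constant `alpha`, and the helper names `fit_cpts` and `impute` are all assumptions made for illustration, and the sketch assumes exactly one missing value per row. Given a fixed structure, it scores each candidate value using only the target's own conditional probability and those of its children, so attributes outside that neighbourhood (here `day`) play no role.

```python
from collections import Counter, defaultdict

def fit_cpts(rows, parents, alpha=1.0):
    """Estimate Laplace-smoothed conditional probability tables from complete rows."""
    domains = {a: sorted({r[a] for r in rows}) for a in parents}
    counts = {a: defaultdict(Counter) for a in parents}
    for r in rows:
        for a, pa in parents.items():
            counts[a][tuple(r[p] for p in pa)][r[a]] += 1

    def prob(attr, value, row):
        # P(attr = value | parents of attr as given in row)
        ctx = counts[attr][tuple(row[p] for p in parents[attr])]
        total = sum(ctx.values()) + alpha * len(domains[attr])
        return (ctx[value] + alpha) / total

    return domains, prob

def impute(row, target, parents, domains, prob):
    """Return (most probable value, its posterior probability) for row[target]."""
    children = [a for a, pa in parents.items() if target in pa]
    scores = {}
    for v in domains[target]:
        filled = dict(row, **{target: v})
        s = prob(target, v, filled)           # P(target | its parents)
        for c in children:
            s *= prob(c, filled[c], filled)   # P(child | its parents)
        scores[v] = s
    z = sum(scores.values())
    best = max(scores, key=scores.get)
    return best, scores[best] / z

# Hypothetical structure: rain -> wet <- sprinkler; "day" is unrelated.
parents = {"rain": (), "sprinkler": (), "wet": ("rain", "sprinkler"), "day": ()}
rows = [
    {"rain": "yes", "sprinkler": "off", "wet": "yes", "day": "mon"},
    {"rain": "yes", "sprinkler": "off", "wet": "yes", "day": "tue"},
    {"rain": "no",  "sprinkler": "off", "wet": "no",  "day": "mon"},
    {"rain": "no",  "sprinkler": "off", "wet": "no",  "day": "tue"},
    {"rain": "no",  "sprinkler": "on",  "wet": "yes", "day": "mon"},
    {"rain": "no",  "sprinkler": "off", "wet": "no",  "day": "wed"},
]
domains, prob = fit_cpts(rows, parents)

incomplete = {"rain": None, "sprinkler": "off", "wet": "yes", "day": "wed"}
best, p = impute(incomplete, "rain", parents, domains, prob)
mode = Counter(r["rain"] for r in rows).most_common(1)[0][0]
print(best, round(p, 3), "mode imputation would give:", mode)
```

In this toy data the most frequent value of `rain` is `no`, so mode imputation would fill in `no`, while the network-based estimate uses the observed `wet` and `sprinkler` values and also returns a probability that can serve as a confidence measure for the filled value.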