Exploiting missing clinical data in Bayesian network modeling for predicting medical problems

Authors:
Jau-Huei Lin;Peter J. Haug
Affiliations:
Department of Biomedical Informatics, University of Utah, 26 South 2000 East Room 5775 HSEB, Salt Lake City, UT 84112-5750, USA and Information System, LDS Hospital, Intermountain Healthcare, 8th ...;Department of Biomedical Informatics, University of Utah, 26 South 2000 East Room 5775 HSEB, Salt Lake City, UT 84112-5750, USA and Information System, LDS Hospital, Intermountain Healthcare, 8th ...
Venue:
Journal of Biomedical Informatics
Year:
2008

Abstract

When machine learning algorithms are applied to data collected during the course of clinical care, it is generally accepted that the data have not been collected consistently. The absence of expected data elements is common, and the mechanism through which a data element is missing often involves the clinical relevance of that data element in a specific patient. Therefore, the absence of data may have information value of its own. In the process of designing an application intended to support a medical problem list, we studied whether the "missingness" of clinical data can provide useful information in building prediction models. In this study, we experimented with four methods of treating missing values in a clinical data set; two of them explicitly model the absence, or "missingness", of data. Each of the resulting data sets was used to build four different kinds of Bayesian classifiers: a naive Bayes structure, a human-composed network structure, and two networks based on structural learning algorithms. We compared the performance of the groups with and without explicit models of missingness using the area under the ROC curve. The results showed that in most cases the classifiers trained using the explicit missing value treatments performed better. This result suggests that information may exist in "missingness" itself. Thus, when designing a decision support system, we suggest explicitly representing the presence or absence of data in the underlying logic.
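
The abstract does not spell out the four missing-value treatments, so the following is only a minimal sketch of one plausible way to model missingness explicitly: encoding an absent value as an extra discrete state of the variable in a categorical naive Bayes classifier. The names `MISSING`, `encode_missing_as_state`, `fit_naive_bayes`, and `predict_proba`, as well as the toy records, are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict

MISSING = "MISSING"  # hypothetical label for the extra "value absent" state


def encode_missing_as_state(rows):
    """Replace None with an explicit MISSING state in every feature."""
    return [{f: (MISSING if v is None else v) for f, v in row.items()}
            for row in rows]


def fit_naive_bayes(rows, labels, alpha=1.0):
    """Count classes and per-class feature values (categorical naive Bayes)."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature, label) -> value counts
    states = defaultdict(set)            # feature -> states seen in training
    for row, y in zip(rows, labels):
        for f, v in row.items():
            value_counts[(f, y)][v] += 1
            states[f].add(v)
    return class_counts, value_counts, states, alpha


def predict_proba(model, row):
    """Posterior over classes, using Laplace-smoothed conditional tables."""
    class_counts, value_counts, states, alpha = model
    total = sum(class_counts.values())
    log_scores = {}
    for y, n_y in class_counts.items():
        score = math.log(n_y / total)
        for f, v in row.items():
            numerator = value_counts[(f, y)][v] + alpha
            denominator = n_y + alpha * len(states[f])
            score += math.log(numerator / denominator)
        log_scores[y] = score
    # Normalize in a numerically stable way.
    top = max(log_scores.values())
    weights = {y: math.exp(s - top) for y, s in log_scores.items()}
    z = sum(weights.values())
    return {y: w / z for y, w in weights.items()}


# Synthetic toy records (not study data): the test is usually not ordered
# for patients without the problem, so its absence carries information.
raw_rows = [
    {"lab_test": "high"}, {"lab_test": "high"},
    {"lab_test": "normal"}, {"lab_test": None},
    {"lab_test": None}, {"lab_test": None},
    {"lab_test": None}, {"lab_test": "normal"},
]
labels = ["problem"] * 4 + ["no_problem"] * 4

rows = encode_missing_as_state(raw_rows)
model = fit_naive_bayes(rows, labels)
posterior = predict_proba(model, encode_missing_as_state([{"lab_test": None}])[0])
print(posterior)
```

In this toy setting the classifier shifts probability toward `no_problem` when the test was never ordered, which is the kind of signal a treatment that discards or imputes missing values could lose. Comparing such treatments by area under the ROC curve, as the study does, would require a held-out data set.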