Impact of censoring on learning Bayesian networks in survival modelling

Authors:
Ivan Štajduhar;Bojana Dalbelo-Bašić;Nikola Bogunović
Affiliations:
Department of Automation, Electronics and Computing, Faculty of Engineering, University of Rijeka, Vukovarska 58, 51000 Rijeka, Croatia;Department of Electronics, Microelectronics, Computer and Intelligent Systems, Faculty of Electrical Engineering and Computing, University of Zagreb, Unska 3, 10000 Zagreb, Croatia;Department of Electronics, Microelectronics, Computer and Intelligent Systems, Faculty of Electrical Engineering and Computing, University of Zagreb, Unska 3, 10000 Zagreb, Croatia
Venue:
Artificial Intelligence in Medicine
Year:
2009

Abstract

Objective: Bayesian networks are commonly used for presenting uncertainty and covariate interactions in an easily interpretable way. Because of their efficient inference and ability to represent causal relationships, they are an excellent choice for medical decision support systems in diagnosis, treatment, and prognosis. Although good procedures for learning Bayesian networks from data have been defined, their performance in learning from censored survival data has not been widely studied. In this paper, we explore how to use these procedures to learn about possible interactions between prognostic factors and their influence on the variate of interest. We study how censoring affects the probability of learning correct Bayesian network structures. Additionally, we analyse the potential usefulness of the learnt models for predicting the time-independent probability of an event of interest.

Methods and materials: We analysed the influence of censoring with a simulation on synthetic data sampled from randomly generated Bayesian networks. We used two well-known methods for learning Bayesian networks from data: a constraint-based method and a score-based method. We compared the performance of each method under different levels of censoring with that of the naive Bayes classifier and the proportional hazards model. We did additional experiments on several datasets from real-world medical domains. The machine-learning methods treated censored cases in the data as event-free.

Results: We report and compare results for several commonly used model evaluation metrics. On average, the proportional hazards method outperformed other methods in most censoring setups. As part of the simulation study, we also analysed structural similarities of the learnt networks. Heavy censoring, as opposed to no censoring, produces up to a 5% surplus and up to 10% missing total arcs. It also produces up to 50% missing arcs that should originally be connected to the variate of interest.

Conclusion: Presented methods for learning Bayesian networks from data can be used to learn from censored survival data in the presence of light censoring (up to 20%) by treating censored cases as event-free. Given intermediate or heavy censoring, the learnt models become tuned to the majority class and would thus require a different approach.
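
The effect described in the conclusion can be illustrated with a small sketch. This is not the simulation used in the paper: the exponential event and censoring times, the fixed prediction horizon, and the function name `label_event_free` are all assumptions made for illustration. It only shows how treating censored cases as event-free shrinks the share of cases labelled as events as censoring grows, which is one way the training data can become dominated by the event-free class.

```python
import random


def label_event_free(n, censor_rate, horizon=1.0, event_rate=1.0, seed=0):
    """Simulate n cases and label censored ones as event-free.

    Hypothetical setup (not from the paper): event and censoring times
    are exponential, and the class label is "event observed by `horizon`".
    Returns (fraction labelled as event, fraction censored before horizon).
    """
    rng = random.Random(seed)
    events = 0
    censored = 0
    for _ in range(n):
        event_time = rng.expovariate(event_rate)
        # censor_rate == 0 means follow-up is never cut short.
        if censor_rate > 0:
            censor_time = rng.expovariate(censor_rate)
        else:
            censor_time = float("inf")
        if event_time <= min(censor_time, horizon):
            events += 1      # event seen during follow-up
        elif censor_time < horizon:
            censored += 1    # lost before the horizon: labelled event-free
    return events / n, censored / n


if __name__ == "__main__":
    for rate in (0.0, 0.25, 1.0, 2.0):
        pos, cens = label_event_free(5000, rate)
        print(f"censor_rate={rate:<4} censored={cens:.2f} labelled_event={pos:.2f}")
```

In this toy setup the true event probability by the horizon stays the same across runs; only the labels change, because every case lost to follow-up is counted as event-free.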