Survival Prediction in Lung Cancer Treated with Radiotherapy: Bayesian Networks vs. Support Vector Machines in Handling Missing Data

Authors:
Andre Dekker; Cary Dehing-Oberije; Dirk De Ruysscher; Philippe Lambin; Kartik Komati; Glenn Fung; Shipeng Yu; Andrew Hope; Wilfried De Neve; Yolande Lievens
Venue:
ICMLA '09 Proceedings of the 2009 International Conference on Machine Learning and Applications
Year:
2009

Abstract

Missing data is a given in the medical domain, so machine learning models should perform satisfactorily even when data are missing. Our previous work focused on support vector machines (SVM), but we hypothesize that Bayesian networks (BN) handle missing data better. To test this hypothesis, we trained a BN model and an SVM model for 2-year survival on 322 lung cancer patients and compared their performance on three separate external datasets (35, 47, and 33 patients), each with its own missing-data characteristics. The models used tumor size, clinical T and N stage, involved lymph nodes, and WHO performance status as prognostic features. The BN model performed better than the SVM model (AUC 0.77, 0.72, 0.70 vs. 0.71, 0.68, 0.69), especially when tumor size was missing. We conclude that BN models are better suited to the medical domain because they handle missing data better.
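
To illustrate why a Bayesian network can tolerate missing inputs, the following minimal Python sketch computes a survival posterior while marginalizing out any unobserved feature. It is not the authors' model: the naive network structure (outcome as the only parent of each feature), the reduced feature set, the category names, and every probability are hypothetical values chosen only for illustration.

```python
# Hypothetical toy network: outcome -> tumor_size, outcome -> n_stage.
# All probabilities below are invented for illustration, not taken from the paper.
PRIOR = {"alive": 0.4, "dead": 0.6}
CPT = {
    "tumor_size": {
        "alive": {"small": 0.7, "large": 0.3},
        "dead": {"small": 0.3, "large": 0.7},
    },
    "n_stage": {
        "alive": {"N0": 0.6, "N2": 0.4},
        "dead": {"N0": 0.3, "N2": 0.7},
    },
}


def posterior_alive(evidence):
    """Return P(alive | observed features); features set to None are marginalized out."""
    scores = {}
    for outcome, prior in PRIOR.items():
        p = prior
        for feature, value in evidence.items():
            if value is None:
                # Summing P(feature | outcome) over all values gives 1,
                # so a missing feature simply contributes no factor.
                continue
            p *= CPT[feature][outcome][value]
        scores[outcome] = p
    return scores["alive"] / sum(scores.values())


complete = posterior_alive({"tumor_size": "small", "n_stage": "N0"})
size_missing = posterior_alive({"tumor_size": None, "n_stage": "N0"})
print(f"complete record:    {complete:.3f}")
print(f"tumor size missing: {size_missing:.3f}")
```

In this sketch the prediction for the incomplete record needs no imputation step: the missing variable drops out of the product. A standard SVM, by contrast, requires a complete feature vector, so a missing tumor size would have to be filled in before prediction.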