Machine learning for survival analysis: a case study on recurrence of prostate cancer

Authors:
Blaž Zupan;Janez Demšar;Michael W Kattan;J. Robert Beck;I Bratko
Affiliations:
Faculty of Computer and Information Science, University of Ljubljana, Tržaška 25, SI-1000 Ljubljana, Slovenia and J. Stefan Institute, Ljubljana, Slovenia and Baylor College of Medicine, Hous ...;Faculty of Computer and Information Science, University of Ljubljana, Tržaška 25, SI-1000 Ljubljana, Slovenia;Memorial Sloan Kettering Cancer Center, New York, NY, USA;Baylor College of Medicine, Houston, TX, USA;Faculty of Computer and Information Science, University of Ljubljana, Tržaška 25, SI-1000 Ljubljana, Slovenia and J. Stefan Institute, Ljubljana, Slovenia
Venue:
Artificial Intelligence in Medicine
Year:
2000

Abstract

Machine learning techniques have recently received considerable attention, especially for constructing prediction models from data. Despite their potential advantages over standard statistical methods, such as the ability to model non-linear relationships and to construct symbolic, interpretable models, they are rarely applied to survival analysis, primarily because censored data are difficult to handle appropriately. In this paper we propose a schema that enables the use of classification methods, including machine learning classifiers, for survival analysis. To account for follow-up time and censoring, we propose a technique that, for patients in whom the event did not occur and whose follow-up times are short, estimates their probability of the event and assigns them a corresponding distribution of outcomes. Since most machine learning techniques do not deal with outcome distributions, the schema is implemented using weighted examples. To show the utility of the proposed technique, we investigate the particular problem of building prognostic models for prostate cancer recurrence, where only the probability of the event (and not its dependence on time) is of interest. A case study on preoperative and postoperative prostate cancer recurrence prediction shows that, with this weighting technique, machine learning tools stand alongside modern statistical methods and may, by inducing symbolic recurrence models, provide further insight into relationships within the modeled data.
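
The weighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the probability of the event is estimated, so the sketch below makes several assumptions that are not taken from the source: it uses a Kaplan-Meier estimate, a hypothetical follow-up threshold called `horizon`, and the conditional probability 1 - S(horizon)/S(t) for a patient censored at time t. The function names `kaplan_meier` and `weighted_examples` are illustrative only.

```python
from typing import Callable, List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> Callable[[float], float]:
    """Return S(t), a Kaplan-Meier survival estimate (assumed estimator, not from the source)."""
    event_times = sorted({t for t, e in zip(times, events) if e})
    steps: List[Tuple[float, float]] = []
    s = 1.0
    for et in event_times:
        at_risk = sum(1 for t in times if t >= et)
        d = sum(1 for t, e in zip(times, events) if e and t == et)
        s *= 1.0 - d / at_risk
        steps.append((et, s))

    def survival(t: float) -> float:
        value = 1.0
        for et, s_et in steps:
            if et <= t:
                value = s_et
            else:
                break
        return value

    return survival


def weighted_examples(features, times, events, horizon):
    """Turn survival data into (features, class, weight) classification examples.

    Simplifying assumptions: an observed event is class 1 with weight 1;
    an event-free patient followed for at least `horizon` is class 0 with
    weight 1; an event-free patient with shorter follow-up is split into
    two examples whose weights sum to 1.
    """
    surv = kaplan_meier(times, events)
    examples = []
    for x, t, e in zip(features, times, events):
        if e:
            examples.append((x, 1, 1.0))
        elif t >= horizon:
            examples.append((x, 0, 1.0))
        else:
            s_t = surv(t)
            # P(event by horizon | event-free at t), under the Kaplan-Meier assumption
            p_event = 1.0 - surv(horizon) / s_t if s_t > 0 else 1.0
            examples.append((x, 1, p_event))
            examples.append((x, 0, 1.0 - p_event))
    return examples


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    feats = ["a", "b", "c", "d", "e", "f"]
    times = [2, 3, 4, 5, 6, 8]
    events = [1, 0, 1, 0, 0, 0]
    for row in weighted_examples(feats, times, events, horizon=7):
        print(row)
```

The resulting triples could be passed to any classifier that accepts per-example weights. Because each patient's weights sum to one, every patient contributes equally to training regardless of how the outcome is split.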