Motivation: Patient outcome prediction using microarray technologies is an important application in bioinformatics. Based on patients' genotypic microarray data, predictions are made to estimate patients' survival time and their risk of tumor metastasis or recurrence. Accurate prediction can therefore help to provide better treatment for patients.

Results: We present a new computational method for patient outcome prediction. In the training phase of this method, we use two types of extreme patient samples: short-term survivors, who had an unfavorable outcome within a short period, and long-term survivors, who maintained a favorable outcome after a long follow-up time. These extreme training samples provide a clear basis for identifying relevant genes whose expression is closely related to the outcome. The selected extreme samples and the relevant genes are then integrated by a support vector machine to build a prediction model, by which each validation sample is assigned a risk score that falls into one of the pre-defined risk groups. We apply this method to several public datasets. In most cases, patients in the high and low risk groups stratified by our method have clearly distinguishable outcome status, as seen in their Kaplan-Meier curves. We also show that selecting only extreme patient samples for training is effective for improving prediction accuracy when different gene selection methods are used.

Contact: huiqing@i2r.a-star.edu.sg

Supplementary information: http://research.i2r.a-star.edu.sg/huiqing/supplementaldata/survival/survival.html
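The training-sample selection described above can be illustrated with a minimal sketch. It is not the authors' implementation: the record layout (`time`, `event`, `expr`), the cutoff values, and the Welch-style t-statistic used to rank genes are all assumptions made for illustration, and the support vector machine step is left out.

```python
from math import sqrt
from statistics import mean, variance


def select_extreme_samples(patients, short_cutoff, long_cutoff):
    """Split patients into short-term and long-term survivors.

    Assumed record layout (hypothetical): 'time' is follow-up time,
    'event' is True if the unfavorable outcome occurred, 'expr' is a
    list of gene expression values. Patients matching neither
    definition are left out of training.
    """
    short_term = [p for p in patients
                  if p["event"] and p["time"] < short_cutoff]
    long_term = [p for p in patients
                 if not p["event"] and p["time"] > long_cutoff]
    return short_term, long_term


def rank_genes(short_term, long_term):
    """Rank gene indices by absolute Welch-style t-statistic.

    This ranking is a stand-in for whichever gene selection method is
    used. Each group needs at least two samples.
    """
    n_genes = len(short_term[0]["expr"])
    scores = []
    for g in range(n_genes):
        a = [p["expr"][g] for p in short_term]
        b = [p["expr"][g] for p in long_term]
        se = sqrt(variance(a) / len(a) + variance(b) / len(b))
        t = (mean(a) - mean(b)) / se if se > 0 else 0.0
        scores.append((abs(t), g))
    # Highest-scoring genes first.
    return [g for _, g in sorted(scores, reverse=True)]


# Toy data: times in years, two genes per patient (all values invented).
patients = [
    {"time": 0.5, "event": True, "expr": [5.0, 1.0]},
    {"time": 0.8, "event": True, "expr": [5.5, 1.2]},
    {"time": 3.0, "event": True, "expr": [3.0, 0.9]},   # intermediate
    {"time": 2.0, "event": False, "expr": [3.0, 1.1]},  # short follow-up
    {"time": 6.0, "event": False, "expr": [1.0, 1.1]},
    {"time": 7.0, "event": False, "expr": [1.5, 0.9]},
]

short_term, long_term = select_extreme_samples(
    patients, short_cutoff=1.0, long_cutoff=5.0)
gene_order = rank_genes(short_term, long_term)
print(len(short_term), len(long_term), gene_order)
```

In this sketch, patients with intermediate outcomes or short event-free follow-up are excluded from training, which mirrors the idea of learning only from clearly separated cases. The top-ranked genes and the two extreme groups would then be passed to a classifier that produces the risk score.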