Using support vector regression to model the correlation between the clinical metastases time and gene expression profile for breast cancer

  • Authors:
  • Shih-Hau Chiu;Chien-Chi Chen;Thy-Hou Lin

  • Affiliations:
  • Institute of Molecular Medicine & Department of Life Science, National Tsing Hua University, HsinChu, Taiwan and Bioresource Collection and Research Center, Food Industry Research and Development ...;Bioresource Collection and Research Center, Food Industry Research and Development Institute, HsinChu, Taiwan;Institute of Molecular Medicine & Department of Life Science, National Tsing Hua University, HsinChu, Taiwan

  • Venue:
  • Artificial Intelligence in Medicine
  • Year:
  • 2008

Abstract

Objective: Microarray analysis has become an important tool for studying cancer types, biological mechanisms, and diagnostic biomarkers. Several machine-learning methods have been used to construct prognostic models from microarray data sets. However, most previous studies focused on supervised classification to predict the clinical type of patients. In this study, we investigate whether the expression levels of a set of identified significant genes can be used to predict the clinical metastases time of patients. Materials and methods: We used a regression method to remodel the breast cancer data set published in 2002. Significant genes were ranked and selected using a wrapper method with a 10-fold cross-validation procedure, and the selected genes were used to fit a support vector regression (SVR) model. This method models the relationship between the expression values of the significant genes and the clinical metastases time of breast cancer. Results: 44 significant genes were selected for building the regression model, and the corresponding cross-validated correlation coefficient is 0.82, which is considerably higher than the values reported previously by others using different data sets. Moreover, two breast cancer related genes, chemokine (C-X-C motif) ligand 14 (CXCL14) and the estrogen receptor gene (ER), are among the selected genes, and one of them has never been included in the other data sets. Conclusion: We have shown that the expression levels of the identified significant genes correlate strongly with the clinical metastases time of breast cancer patients. The 44 selected genes may be used as a benchmark to evaluate the risk of recurrence of breast cancer.
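The workflow described in the abstract (gene selection, an SVR fit, and a 10-fold cross-validated correlation coefficient) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the kernel, the hyperparameters, or the exact wrapper procedure, so the sketch assumes scikit-learn, a linear kernel, and recursive feature elimination (RFE) as a stand-in wrapper-style selector. The data are synthetic, and the sizes (80 samples, 50 features, 10 selected) are arbitrary illustrative values, not those of the 2002 data set.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for an expression matrix (rows: patients, columns: genes).
# All sizes and weights here are hypothetical, chosen only for illustration.
rng = np.random.default_rng(0)
n_patients, n_genes, n_informative = 80, 50, 5
X = rng.normal(size=(n_patients, n_genes))
weights = np.array([3.0, -2.0, 2.0, -1.5, 1.0])
# The target plays the role of the clinical time; only the first 5 "genes" matter.
y = X[:, :n_informative] @ weights + rng.normal(scale=0.3, size=n_patients)

# Selection sits inside the pipeline, so genes are re-selected on the training
# part of every fold and the held-out fold never influences the selection.
model = Pipeline([
    ("scale", StandardScaler()),
    # Assumption: RFE with a linear SVR replaces the paper's unspecified wrapper.
    ("select", RFE(SVR(kernel="linear"), n_features_to_select=10, step=5)),
    ("svr", SVR(kernel="linear", C=1.0, epsilon=0.1)),
])

# 10-fold cross-validation: each sample is predicted by a model that never saw it.
cv = KFold(n_splits=10, shuffle=True, random_state=0)
pred = cross_val_predict(model, X, y, cv=cv)

# Pearson correlation between observed values and pooled out-of-fold predictions.
r = np.corrcoef(y, pred)[0, 1]
print(f"cross-validated correlation coefficient: {r:.2f}")
```

Pooling the out-of-fold predictions before computing the Pearson correlation is one common way to obtain a single cross-validated correlation coefficient; averaging per-fold correlations is another, and the abstract does not say which was used.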