Predicting cancer susceptibility from single-nucleotide polymorphism data: a case study in multiple myeloma

  • Authors:
  • Michael Waddell;David Page;John Shaughnessy, Jr.

  • Affiliations:
  • University of Wisconsin, Wisconsin;University of Wisconsin, Wisconsin;University of Arkansas for Medical Sciences Donna D. and Donald M. Lambert, Arkansas

  • Venue:
  • Proceedings of the 5th International Workshop on Bioinformatics
  • Year:
  • 2005

Abstract

This paper asks whether susceptibility to early onset (diagnosis before age 40) of a particularly deadly form of cancer, Multiple Myeloma, can be predicted from single-nucleotide polymorphism (SNP) profiles with accuracy greater than chance. Specifically, given SNP profiles for 80 Multiple Myeloma patients, of whom we believe 40 to have high susceptibility and 40 to have lower susceptibility, we train a support vector machine (SVM) to predict age at diagnosis. We chose SVMs for this task because they are well suited to handling both interactions among features and redundant features. The accuracy of the trained SVM, estimated by leave-one-out cross-validation, is 71%, significantly greater than random guessing. This result is particularly encouraging because only 3000 SNPs were used in profiling, whereas several million SNPs are known.
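
To make the evaluation procedure in the abstract concrete, here is a minimal Python sketch of leave-one-out cross-validation followed by a one-sided binomial check against chance. It is an illustration under stated assumptions, not the authors' pipeline: the data are synthetic, the classifier is a simple nearest-centroid stand-in rather than an SVM, the helper names are hypothetical, and the count of 57 correct out of 80 is inferred from the reported 71% rather than stated in the abstract. The binomial check also treats the 80 held-out predictions as independent, which is only an approximation.

```python
import random
from math import comb


def nearest_centroid_predict(train_X, train_y, x):
    """Stand-in classifier (NOT an SVM): pick the class whose mean profile is closest."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, lab in zip(train_X, train_y) if lab == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_correct(X, y, predict):
    """Hold out each sample once, train on the rest, and count correct predictions."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct


def binomial_tail(k, n, p=0.5):
    """P(at least k successes in n independent trials with success probability p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


def synthetic_profiles(n_per_class=40, n_snps=50, n_informative=5, seed=0):
    """Synthetic genotypes coded 0/1/2 (minor-allele counts); purely illustrative."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = []
            for s in range(n_snps):
                # Assumption: a few SNPs have a shifted allele frequency in class 1.
                freq = 0.5 if (label == 1 and s < n_informative) else 0.3
                row.append(sum(rng.random() < freq for _ in range(2)))
            X.append(row)
            y.append(label)
    return X, y


X, y = synthetic_profiles()
correct = leave_one_out_correct(X, y, nearest_centroid_predict)
print(f"synthetic LOOCV accuracy: {correct}/{len(y)}")

# Assumption: 71% of 80 patients corresponds to 57 correct predictions.
assumed_correct = 57
p_value = binomial_tail(assumed_correct, 80)
print(f"P(at least {assumed_correct} of 80 correct by chance) = {p_value:.2e}")
```

With balanced classes (40 versus 40), random guessing is expected to be right half the time, so `p=0.5` is the natural null hypothesis. Under these assumptions the tail probability comes out below 0.001, which is consistent with the abstract's description of the result as significantly better than chance.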