Predicting cancer susceptibility from single-nucleotide polymorphism data: a case study in multiple myeloma

Authors:
Michael Waddell;David Page;John Shaughnessy, Jr.
Affiliations:
University of Wisconsin, Wisconsin;University of Wisconsin, Wisconsin;University of Arkansas for Medical Sciences Donna D. and Donald M. Lambert, Arkansas
Venue:
Proceedings of the 5th international workshop on Bioinformatics
Year:
2005

Abstract

This paper asks whether susceptibility to early onset (diagnosis before age 40) of a particularly deadly form of cancer, Multiple Myeloma, can be predicted from single-nucleotide polymorphism (SNP) profiles with an accuracy greater than chance. Specifically, given SNP profiles for 80 Multiple Myeloma patients, of whom we believe 40 to have high susceptibility and 40 to have lower susceptibility, we train a support vector machine (SVM) to predict age at diagnosis. We chose SVMs for this task because they are well suited to handling both interactions among features and redundant features. The accuracy of the trained SVM, estimated by leave-one-out cross-validation, is 71%, significantly greater than that of random guessing. This result is particularly encouraging since only 3000 SNPs were used in profiling, whereas several million SNPs are known.
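
As a rough check on the claim that 71% is better than chance, the arithmetic can be sketched as below. This is only an illustration under stated assumptions, not the significance test used in the paper, which the abstract does not describe. It assumes that 71% of 80 patients corresponds to 57 correct predictions (a rounding assumption), that chance accuracy on the balanced 40/40 split is 50%, and that a one-sided binomial tail is an acceptable approximation even though leave-one-out folds are not strictly independent. The function name `binomial_tail` is hypothetical.

```python
from math import comb


def binomial_tail(n, k, p=0.5):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_patients = 80            # from the abstract
reported_accuracy = 0.71   # from the abstract

# Assumption: the reported 71% rounds from 57 correct out of 80.
n_correct = round(reported_accuracy * n_patients)

# Probability of doing at least this well by guessing at 50% accuracy.
p_value = binomial_tail(n_patients, n_correct)

print(f"correct predictions: {n_correct} of {n_patients}")
print(f"one-sided binomial tail probability: {p_value:.2e}")
```

Under these assumptions the tail probability comes out well below 0.001, which is consistent with the abstract's statement. A permutation test on the class labels would be a more careful alternative because it does not rely on independence between folds.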