Bagging support vector machine for classification of SELDI-ToF mass spectra of ovarian cancer serum samples

Authors:
Bailing Zhang;Tuan D. Pham;Yanchun Zhang
Affiliations:
School of Computer Science and Mathematics, Victoria University, VIC, Australia;School of Information Technology, James Cook University, QLD, Australia;School of Computer Science and Mathematics, Victoria University, VIC, Australia
Venue:
AI'07 Proceedings of the 20th Australian joint conference on Advances in artificial intelligence
Year:
2007

Abstract

There has been much recent progress in identifying diagnostic proteomic signatures for different human cancers using surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) mass spectrometry. To identify proteomic patterns in serum that discriminate cancer patients from normal individuals, many classification methods have been tried, often with successful results. Most of these earlier studies, however, apply the original mass spectra directly, together with dimension reduction methods such as PCA or feature selection methods such as t-tests. Because only the peaks of MS data correspond to potential biomarkers, it is important to study classification methods that use the detected peaks. This paper investigates ovarian cancer identification from the detected MS peaks by applying a bagging support vector machine (SVM), a special case of bootstrap aggregating (bagging). In bagging SVM, each individual SVM is trained independently on training samples drawn at random with a bootstrap technique. The trained SVMs are then aggregated to make a collective decision in an appropriate way, for example by majority voting. Using the detected peaks, bagged SVM achieved 94% accuracy, with 95% sensitivity and 92% specificity. Efficiency can be further improved by applying PCA to reduce the dimensionality.
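The bagging procedure described in the abstract (bootstrap resampling, independently trained SVMs, majority voting) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the kernel, solver, or ensemble size, so the sketch assumes a linear SVM trained with Pegasos-style subgradient descent, 11 ensemble members, and synthetic two-dimensional data standing in for detected peak intensities. All function names and parameters (`train_linear_svm`, `train_bagged_svm`, `bagged_predict`, `make_toy_data`, `lam`, `epochs`, `n_estimators`) are hypothetical.

```python
import random
from collections import Counter


def train_linear_svm(X, y, lam=0.01, epochs=20, rng=None):
    """Linear SVM via Pegasos-style subgradient descent on the hinge loss.

    Labels must be -1 or +1. This solver is an assumption made for the
    sketch; the abstract does not say how the individual SVMs were trained.
    """
    rng = rng or random.Random(0)
    X = [list(x) + [1.0] for x in X]  # constant feature acts as a regularized bias
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # shrink (regularization)
            if margin < 1.0:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def svm_predict(w, x):
    """Return +1 or -1 from the sign of the linear decision function."""
    score = sum(wj * xj for wj, xj in zip(w, list(x) + [1.0]))
    return 1 if score >= 0 else -1


def train_bagged_svm(X, y, n_estimators=11, seed=0):
    """Train each SVM independently on its own bootstrap sample."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # draw n samples with replacement
        models.append(
            train_linear_svm([X[i] for i in idx], [y[i] for i in idx], rng=rng)
        )
    return models


def bagged_predict(models, x):
    """Aggregate the individual SVM decisions by majority vote."""
    votes = Counter(svm_predict(w, x) for w in models)
    return votes.most_common(1)[0][0]


def make_toy_data(n_per_class=50, seed=1):
    """Synthetic two-class data; a stand-in for peak intensities (assumption)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((1, 3.0), (-1, -3.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
            y.append(label)
    return X, y


X, y = make_toy_data()
models = train_bagged_svm(X, y)
accuracy = sum(bagged_predict(models, x) == t for x, t in zip(X, y)) / len(X)
print(f"training accuracy on toy data: {accuracy:.2f}")
```

An odd ensemble size is used so that a two-class majority vote cannot tie. The accuracy printed here refers only to the synthetic data and says nothing about the figures reported in the abstract.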