Genetic programming for biomarker detection in mass spectrometry data

Authors:
Soha Ahmed;Mengjie Zhang;Lifeng Peng
Affiliations:
School of Engineering and Computer Science, Victoria University of Wellington, Wellington, New Zealand;School of Engineering and Computer Science, Victoria University of Wellington, Wellington, New Zealand;School of Biological Sciences, Victoria University of Wellington, Wellington, New Zealand
Venue:
AI'12 Proceedings of the 25th Australasian joint conference on Advances in Artificial Intelligence
Year:
2012

Abstract

Classification of mass spectrometry (MS) data is an essential step in biomarker detection, which can help in the diagnosis and prognosis of diseases. However, the high dimensionality and small sample size of MS data make classification very challenging. In machine learning terms, biomarker detection can be framed as feature selection and classification. Genetic programming (GP) has been widely used for classification and feature selection, but it has not been effectively applied to biomarker detection in MS data. In this study we develop a GP-based approach to feature selection, feature extraction and classification of MS data for biomarker detection. In this approach, we first use GP to reduce "redundant" features by selecting a small number of important features and constructing high-level features; we then use GP to classify the data based on the selected and constructed features. The approach is compared with three well-known machine learning methods, namely decision trees (J48), naive Bayes and support vector machines (SVMs), on two biomarker detection data sets. The results show that the proposed GP method can effectively select a small number of important features from thousands of original features for these problems, that the constructed high-level features can further improve classification performance, and that the GP method outperforms the three existing methods on these problems.
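The abstract does not specify the GP representation, operators or fitness function, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes tree-based GP over arithmetic functions, training accuracy as fitness, a binary rule in which a non-negative program output predicts class 1, and synthetic data in place of real MS spectra. The features that appear in the best evolved program are treated as the "selected" features, and the program output itself can be read as one constructed high-level feature. All names (`evolve`, `features_used`, `make_data`) are hypothetical.

```python
import random

# Function set (assumed); protected division returns 1.0 for near-zero divisors.
FUNCS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "div": lambda a, b: a / b if abs(b) > 1e-9 else 1.0,
}

def random_tree(n_features, max_depth, rng):
    """Grow a random expression tree; leaves are ("feat", index) terminals."""
    if max_depth == 0 or rng.random() < 0.3:
        return ("feat", rng.randrange(n_features))
    op = rng.choice(sorted(FUNCS))
    return (op, random_tree(n_features, max_depth - 1, rng),
            random_tree(n_features, max_depth - 1, rng))

def evaluate(tree, x):
    """Compute the program output for one sample x (a list of feature values)."""
    if tree[0] == "feat":
        return x[tree[1]]
    return FUNCS[tree[0]](evaluate(tree[1], x), evaluate(tree[2], x))

def features_used(tree):
    """Feature indices appearing in the tree: the implicitly selected features."""
    if tree[0] == "feat":
        return {tree[1]}
    return features_used(tree[1]) | features_used(tree[2])

def depth(tree):
    return 0 if tree[0] == "feat" else 1 + max(depth(tree[1]), depth(tree[2]))

def accuracy(tree, X, y):
    """Assumed binary rule: output >= 0 predicts class 1, otherwise class 0."""
    hits = sum((evaluate(tree, x) >= 0) == bool(label) for x, label in zip(X, y))
    return hits / len(y)

def random_subtree(tree, rng):
    """Walk down from the root and return a randomly chosen subtree."""
    while tree[0] != "feat" and rng.random() < 0.7:
        tree = tree[rng.choice((1, 2))]
    return tree

def replace_random_node(tree, new, rng):
    """Return a copy of tree with one randomly chosen node replaced by new."""
    if tree[0] == "feat" or rng.random() < 0.3:
        return new
    if rng.random() < 0.5:
        return (tree[0], replace_random_node(tree[1], new, rng), tree[2])
    return (tree[0], tree[1], replace_random_node(tree[2], new, rng))

def evolve(X, y, pop_size=60, generations=20, max_depth=6, seed=0):
    """Evolve a classifier tree with elitism, tournaments, crossover and mutation."""
    rng = random.Random(seed)
    n_features = len(X[0])
    pop = [random_tree(n_features, 3, rng) for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(((accuracy(t, X, y), t) for t in pop),
                        key=lambda s: s[0], reverse=True)
        next_pop = [t for _, t in scored[:2]]  # elitism: keep the two best

        def pick():
            # Tournament selection of size 4.
            return max(rng.sample(scored, 4), key=lambda s: s[0])[1]

        while len(next_pop) < pop_size:
            parent = pick()
            if rng.random() < 0.8:   # crossover: graft a subtree from a second parent
                donor = random_subtree(pick(), rng)
            else:                    # mutation: graft a freshly grown subtree
                donor = random_tree(n_features, 2, rng)
            child = replace_random_node(parent, donor, rng)
            # Reject oversized children to limit bloat.
            next_pop.append(child if depth(child) <= max_depth else parent)
        pop = next_pop
    return max(pop, key=lambda t: accuracy(t, X, y))

def make_data(n_samples=40, n_features=200, seed=1):
    """Synthetic stand-in for MS data: many features, few samples; only
    features 3 and 7 carry class signal (an assumption for illustration)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        shift = 1.5 if label else -1.5
        row[3] += shift
        row[7] += shift
        X.append(row)
        y.append(label)
    return X, y

if __name__ == "__main__":
    X, y = make_data()
    best = evolve(X, y)
    print("training accuracy:", accuracy(best, X, y))
    print("selected feature indices:", sorted(features_used(best)))
```

Because the sketch scores programs on the training data only, the reported accuracy is optimistic. With few samples and many features, as in MS data, a held-out or cross-validated estimate would be needed before comparing against other classifiers.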