Classification of mass spectrometry (MS) data is an essential step in biomarker detection, which can aid the diagnosis and prognosis of diseases. However, the high dimensionality and small sample size of MS data make classification very challenging. In machine learning terms, biomarker detection can be treated as a feature selection and classification problem. Genetic programming (GP) has been widely used for classification and feature selection, but it has not been effectively applied to biomarker detection in MS data. In this study we develop a GP-based approach to feature selection, feature extraction and classification of MS data for biomarker detection. In this approach, we first use GP to reduce "redundant" features by selecting a small number of important features and constructing high-level features; we then use GP to classify the data based on the selected and constructed features. The approach is examined and compared with three well-known machine learning methods, namely decision trees (J48), naive Bayes and support vector machines (SVMs), on two biomarker detection data sets. The results show that the proposed GP method can effectively select a small number of important features from thousands of original features for these problems, that the constructed high-level features can further improve classification performance, and that the GP method outperforms all three existing methods on these problems.
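To make the idea of GP-based selection and construction concrete, the following is a minimal sketch and not the authors' implementation. It assumes a tree-based GP with arithmetic operators, a classify-as-positive-when-output-exceeds-zero rule, mutation-only breeding with elitism, and a synthetic data set; all function names, parameters and data here are hypothetical. The features appearing in the best evolved program play the role of the "selected" features, and the program's output is used as one "constructed" high-level feature.

```python
import operator
import random

def pdiv(a, b):
    """Protected division: returns 1.0 when the divisor is (near) zero."""
    return a / b if abs(b) > 1e-9 else 1.0

FUNCS = [operator.add, operator.sub, operator.mul, pdiv]

def random_tree(n_features, depth, rng):
    """Grow a random expression tree; a leaf is a feature index (int)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.randrange(n_features)
    return (rng.choice(FUNCS),
            random_tree(n_features, depth - 1, rng),
            random_tree(n_features, depth - 1, rng))

def evaluate(tree, x):
    """Compute the program output for one sample x."""
    if isinstance(tree, int):
        return x[tree]
    func, left, right = tree
    return func(evaluate(left, x), evaluate(right, x))

def used_features(tree):
    """Indices of the features that appear in the tree (the 'selected' set)."""
    if isinstance(tree, int):
        return {tree}
    return used_features(tree[1]) | used_features(tree[2])

def accuracy(tree, X, y):
    """Fitness: fraction of samples classified correctly (output > 0 means class 1)."""
    hits = sum((evaluate(tree, x) > 0) == bool(label) for x, label in zip(X, y))
    return hits / len(y)

def mutate(tree, n_features, rng):
    """Replace one randomly chosen subtree with a freshly grown one."""
    if isinstance(tree, int) or rng.random() < 0.3:
        return random_tree(n_features, 2, rng)
    func, left, right = tree
    if rng.random() < 0.5:
        return (func, mutate(left, n_features, rng), right)
    return (func, left, mutate(right, n_features, rng))

def evolve(X, y, pop_size=60, generations=30, seed=0):
    """Mutation-only GP with elitism: the top quarter survives each generation."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [random_tree(n, 3, rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda t: accuracy(t, X, y), reverse=True)
        elite = pop[: pop_size // 4]
        pop = elite + [mutate(rng.choice(elite), n, rng)
                       for _ in range(pop_size - len(elite))]
    return max(pop, key=lambda t: accuracy(t, X, y))

# Synthetic stand-in for MS data: few samples, many features, and
# (by construction) only features 3 and 7 carry class information.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(50)] for _ in range(40)]
y = [1 if x[7] - x[3] > 0 else 0 for x in X]

best = evolve(X, y)
selected = sorted(used_features(best))         # feature selection
constructed = [evaluate(best, x) for x in X]   # one high-level feature
print("selected features:", selected)
print("training accuracy:", accuracy(best, X, y))
```

In a fuller pipeline, the selected and constructed features would be passed to a second GP run or to another classifier and assessed on held-out data; training accuracy alone, as printed here, says little when samples are this scarce.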