Biomarker discovery using mass spectrometry (MS) data is very useful in disease detection and drug discovery. Biomarker discovery in MS data must start with feature selection, because the number of features in MS data is extremely large (e.g., thousands) while the number of samples is comparatively small. In this study, we propose the use of genetic programming (GP) for automatic feature selection and classification of MS data. This GP-based approach works by placing the features selected by two feature selection metrics, namely information gain (IG) and Relief-F (REFS-F), in the terminal set. The feature selection performance of the proposed approach is examined and compared with that of IG and REFS-F alone on five MS data sets with different numbers of features and instances. Naive Bayes (NB), support vector machines (SVMs) and J48 decision trees (J48) are used in the experiments to evaluate the classification accuracy of the selected features. GP is also used as a classification method in the experiments, and its performance is compared with that of NB, SVMs and J48. The results show that GP as a feature selection method can select a smaller number of features that yield better classification performance with NB, SVMs and J48 than the features selected by IG and REFS-F. In addition, GP as a classification method outperforms NB and J48 and achieves comparable or slightly better performance than SVMs on these data sets.
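To make the filter step concrete, the following is a minimal sketch of ranking features by information gain and keeping the top k as candidates for a GP terminal set. It is an illustration under stated assumptions, not the procedure used in the study: the median split used to discretize continuous intensities, the function names, the `terminal_set` variable, and the toy data are all hypothetical, and the Relief-F ranking is omitted.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """IG of one continuous feature after a binary split at its median.

    The median split is an assumption made for this sketch only.
    """
    ordered = sorted(values)
    threshold = ordered[len(ordered) // 2]
    low = [y for v, y in zip(values, labels) if v < threshold]
    high = [y for v, y in zip(values, labels) if v >= threshold]
    # Weighted entropy of the two partitions; empty partitions are skipped.
    remainder = sum(
        len(part) / len(labels) * entropy(part) for part in (low, high) if part
    )
    return entropy(labels) - remainder


def top_k_features(rows, labels, k):
    """Return the indices of the k features with the highest IG."""
    n_features = len(rows[0])
    scores = [
        information_gain([row[j] for row in rows], labels) for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


# Toy data (hypothetical): 4 samples, 3 intensity features.
rows = [
    [0.1, 5.0, 1.0],
    [0.2, 1.0, 1.1],
    [0.9, 5.1, 1.0],
    [0.8, 1.2, 1.1],
]
labels = ["healthy", "healthy", "diseased", "diseased"]

# Indices of the features that would be offered to GP as terminals.
terminal_set = top_k_features(rows, labels, k=1)
```

In this toy example only the first feature separates the two classes, so it is the one retained. A fuller pipeline would presumably combine this ranking with a Relief-F ranking before building the terminal set, as the abstract describes.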