Protein secondary structure prediction (PSSP) is one of the main tasks in computational biology. Over the last few decades, much effort has gone into solving this problem with a variety of approaches, chiefly artificial neural networks (ANNs). ANNs for this task are typically trained on the CB513 data set. Like protein structure databases in general, this data set is imbalanced, which can yield a low error rate for the majority class but an unacceptably high error rate for the minority class. In this paper we evaluate how an imbalanced data set affects the training and learning of neural networks applied to protein secondary structure prediction. To do so, we apply resampling methods to address the class imbalance problem. The results show that training on imbalanced data lowers the prediction rate for helices. However, the class distribution of the protein data set does not significantly affect the global accuracy (Q3).
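To make the two quantities in the abstract concrete, the following is a minimal sketch in Python of one possible resampling method (random oversampling) together with Q3 and per-class accuracy. The abstract does not say which resampling methods were used, so random oversampling is an assumption chosen for illustration. The function names (`random_oversample`, `q3`, `per_class_accuracy`) and the toy label strings are hypothetical and are not results from the paper.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of the smaller classes until
    every class has as many samples as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw (with replacement) the number of extra samples this class needs.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


def q3(true, pred):
    """Global three-state accuracy: fraction of residues predicted correctly."""
    return sum(t == p for t, p in zip(true, pred)) / len(true)


def per_class_accuracy(true, pred):
    """Fraction of correctly predicted residues within each true class."""
    totals = Counter(true)
    hits = Counter(t for t, p in zip(true, pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}


# Made-up toy labels (H = helix, E = strand, C = coil), for illustration only.
true = list("HHHHEECCCCCC")
pred = list("HHCCEECCCCCC")

overall = q3(true, pred)                    # 10 of 12 residues correct
by_class = per_class_accuracy(true, pred)   # helix accuracy is only 0.5

# Balance a toy training set whose samples are just residue indices.
bal_x, bal_y = random_oversample(list(range(len(true))), true)
balanced_counts = Counter(bal_y)            # every class now has 6 samples
```

In this toy case Q3 stays above 0.8 while only half of the helix residues are predicted correctly, which is the kind of gap between global and per-class accuracy the abstract describes.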