Class imbalance arises in many applications, such as bioinformatics. On imbalanced data, predictions are usually biased towards the majority class. In several applications, however, prediction accuracy on the minority class matters as much as on the majority class. Many techniques have been proposed to increase accuracy on the minority class. These techniques are based on re-sampling, either over-sampling or under-sampling, during the training process. They do not consider how the data are scattered in the space. In this paper, we propose a new technique based on the fact that the location of the separating function between any two sub-clusters in different classes is defined only by the boundary data of each sub-cluster. In addition, accuracy is measured only on the testing set. Our technique adapts the concept of bootstrapping to estimate a new region for each sub-cluster and to synthesize new boundary data. The new region is intended to cope with unseen testing data. All new synthesized data were classified using the concept of the AdaBoost algorithm. Our results outperformed the other techniques under several performance evaluation functions.
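The idea of estimating a sub-cluster's region by bootstrapping and then synthesizing boundary data can be illustrated with a minimal sketch. This is not the paper's algorithm, whose details the abstract does not give. It assumes a single spherical sub-cluster, takes the region radius to be the mean over bootstrap resamples of the largest centroid-to-point distance, and places synthetic points on the surface of that region. The function names `bootstrap_radius` and `synthesize_boundary` and the `expand` factor are hypothetical, and the AdaBoost classification step is omitted.

```python
import math
import random


def bootstrap_radius(points, n_boot=200, seed=0):
    """Assumed region estimate: the mean, over bootstrap resamples, of the
    largest distance from the resample centroid to a resampled point."""
    rng = random.Random(seed)
    dim = len(points[0])
    radii = []
    for _ in range(n_boot):
        # Resample the sub-cluster with replacement.
        sample = [rng.choice(points) for _ in points]
        centroid = [sum(p[d] for p in sample) / len(sample) for d in range(dim)]
        radii.append(max(math.dist(p, centroid) for p in sample))
    return sum(radii) / len(radii)


def synthesize_boundary(points, n_new, expand=1.1, seed=0):
    """Place n_new synthetic points on a sphere around the sub-cluster centroid.

    The sphere radius is the bootstrap estimate scaled by `expand`, a
    hypothetical margin meant to leave room for unseen test data.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    centroid = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    radius = expand * bootstrap_radius(points, seed=seed)
    new_points = []
    for _ in range(n_new):
        # A normalized Gaussian vector gives a uniformly random direction.
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in direction))
        new_points.append(
            [c + radius * x / norm for c, x in zip(centroid, direction)]
        )
    return new_points


# Toy minority sub-cluster: the four corners of the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = synthesize_boundary(minority, n_new=5)
```

In a full pipeline, the synthetic points would be added to the minority class before training a boosted classifier. A real implementation would also need to find the sub-clusters first and handle non-spherical regions.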