The aim of a computational learning algorithm is to establish principles that work for any type of data, once and for all. However, the majority of classifiers are built on balanced datasets. This paper discusses the issues raised by imbalanced data distributions and the common strategies for dealing with imbalanced datasets. We propose a model that handles imbalanced datasets well in cases where typical classifiers fail. The model adopts a criterion derived from support vector machines (SVMs) to select variables, so that the problem of imbalanced data distribution is alleviated. We then use an Emergent Self-Organizing Map (ESOM) to cluster the ranked features, providing clusters for unsupervised classification. We evaluate the effectiveness of the model on imbalanced datasets. Experimental results show that the criterion based on the weight vector derivative achieves good results and performs consistently well across imbalanced datasets. In general, our approach outperforms other classification methods that are unable to handle the imbalanced data distribution in the testing datasets.
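
As a rough illustration of the variable-selection step, the sketch below ranks features by the squared weights of a linear SVM trained on a small imbalanced toy set. It is not the paper's implementation: the abstract does not spell out the criterion, so the linear kernel, the squared-weight ranking (used here as a simple stand-in for a weight-vector-based criterion), the inverse-frequency class weights, the toy data, and all function and parameter names are assumptions made for this example. The ESOM clustering stage is not shown.

```python
# Hypothetical sketch: names, data, and hyperparameters are illustrative only.

def train_weighted_linear_svm(X, y, lam=0.01, lr=0.1, epochs=300):
    """Batch subgradient descent on a class-weighted, L2-regularized hinge loss."""
    n, d = len(X), len(X[0])
    n_pos = sum(1 for label in y if label == 1)
    n_neg = n - n_pos
    # Assumption: weight each class inversely to its frequency.
    class_weight = {1: n / (2.0 * n_pos), -1: n / (2.0 * n_neg)}
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wk for wk in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wk * xk for wk, xk in zip(w, xi)) + b)
            if margin < 1.0:  # sample violates the margin
                c = class_weight[yi] / n
                for k in range(d):
                    grad_w[k] -= c * yi * xi[k]
                grad_b -= c * yi
        w = [wk - lr * gk for wk, gk in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def rank_features(w):
    """Order feature indices by squared weight, largest first."""
    return sorted(range(len(w)), key=lambda k: w[k] ** 2, reverse=True)


# Toy imbalanced data: 5 minority (+1) samples versus 40 majority (-1) samples.
X, y = [], []
for i in range(45):
    label = 1 if i < 5 else -1
    x0 = 2.0 * label + 0.1 * ((i % 3) - 1)  # informative feature
    x1 = 0.5 * ((i % 4) - 1.5)              # carries almost no class signal
    x2 = 0.3 * (((2 * i) % 5) - 2)          # carries no class signal
    X.append([x0, x1, x2])
    y.append(label)

w, b = train_weighted_linear_svm(X, y)
ranking = rank_features(w)
print("feature ranking (most to least relevant):", ranking)
```

In a pipeline like the one described, the top-ranked features would then be passed to the clustering stage; how many features to keep is a separate choice that this sketch leaves open.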