The nature of statistical learning theory
Robust Classification for Imprecise Environments
Machine Learning
Distributed Data Mining in Credit Card Fraud Detection
IEEE Intelligent Systems
ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning
Less is More: Active Learning with Support Vector Machines
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Query Learning with Large Margin Classifiers
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Support vector machine active learning with applications to text classification
The Journal of Machine Learning Research
A Probabilistic Active Support Vector Learning Algorithm
IEEE Transactions on Pattern Analysis and Machine Intelligence
A study of the behavior of several methods for balancing machine learning training data
ACM SIGKDD Explorations Newsletter - Special issue on learning from imbalanced datasets
Minority report in fraud detection: classification of skewed data
ACM SIGKDD Explorations Newsletter - Special issue on learning from imbalanced datasets
Wrapper-based computation and evaluation of sampling methods for imbalanced datasets
UBDM '05 Proceedings of the 1st international workshop on Utility-based data mining
Learning concepts from large scale imbalanced data sets using support cluster machines
MULTIMEDIA '06 Proceedings of the 14th annual ACM international conference on Multimedia
Exploratory Under-Sampling for Class-Imbalance Learning
ICDM '06 Proceedings of the Sixth International Conference on Data Mining
Statistical Comparisons of Classifiers over Multiple Data Sets
The Journal of Machine Learning Research
A data reduction approach for resolving the imbalanced data issue in functional genomics
Neural Computing and Applications
The class imbalance problem: A systematic study
Intelligent Data Analysis
Learning on the border: active learning in imbalanced data classification
Proceedings of the sixteenth ACM conference on information and knowledge management
A stopping criterion for active learning
Computer Speech and Language
NAACL-Short '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers
Learning when training data are costly: the effect of class distribution on tree induction
Journal of Artificial Intelligence Research
Prototype Selection for Nearest Neighbor Classification: Taxonomy and Empirical Study
IEEE Transactions on Pattern Analysis and Machine Intelligence
EUS SVMs: ensemble of under-sampled SVMs for data imbalance problems
ICONIP'06 Proceedings of the 13th international conference on Neural Information Processing - Volume Part I
The condensed nearest neighbor rule (Corresp.)
IEEE Transactions on Information Theory
The performance of the support vector machine classification model is prone to the class imbalance problem, which occurs when one class of data severely outnumbers the other. Traditionally, this issue has been addressed by balancing class distributions with sampling methods. This paper explores and applies the probabilistic active learning strategy StatQSVM (Mitra et al., 2004) to yield balanced class distributions from large-scale unbalanced datasets. Rather than querying instances based on their proximity, StatQSVM selects a set of instances based on a locally defined confidence factor with respect to the current hyperplane that models the class separation. The exploratory study of StatQSVM is carried out using simulated as well as real-world unbalanced datasets. The study revealed performance deterioration at high class-imbalance settings. To overcome this problem, a fast, probabilistic, cost-weighted undersampling approach with a new stopping criterion, called CStatQSVM, is proposed. The experimental results show that CStatQSVM predicts both the minority and the majority class more successfully than the LOB and StatQSVM active learning methods and other conventional methods that address the class imbalance problem.
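As a point of reference for the conventional sampling methods mentioned in the abstract, the following is a minimal sketch of plain random undersampling of the majority class. It is not StatQSVM or CStatQSVM, which select instances probabilistically rather than uniformly at random; the function name `undersample_majority` and the toy data are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def undersample_majority(samples, labels, seed=0):
    """Randomly drop samples from larger classes until every class has as
    many samples as the smallest class (plain random undersampling)."""
    rng = random.Random(seed)

    # Group the samples by class label.
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)

    # Every class is reduced to the size of the smallest class.
    target = min(len(xs) for xs in by_class.values())

    balanced = []
    for y, xs in by_class.items():
        kept = xs if len(xs) == target else rng.sample(xs, target)
        balanced.extend((x, y) for x in kept)

    # Shuffle so the classes are interleaved in the returned data.
    rng.shuffle(balanced)
    return [x for x, _ in balanced], [y for _, y in balanced]


# Toy data (an assumption for illustration): 95 majority points with
# label 0 and 5 minority points with label 1.
samples = [[float(i)] for i in range(100)]
labels = [0] * 95 + [1] * 5

X_bal, y_bal = undersample_majority(samples, labels)
class_counts = Counter(y_bal)
```

After the call, both classes in `class_counts` have the same size and every minority instance is retained. Uniform random dropping can discard majority instances that are informative for the decision boundary, which is the kind of loss that selective, confidence-based querying aims to avoid.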