Learning classifiers from datasets with imbalanced class distributions is an important problem in data mining. The problem arises when one class is represented by far fewer examples than the others. Because it appears in many real-world applications, it has attracted growing attention from researchers. This work briefly reviews the main issues of the problem and describes two common approaches for dealing with imbalance: sampling and cost-sensitive learning. We also pay special attention to some open problems. In particular, we discuss the intrinsic data characteristics of imbalanced classification, namely the size of the dataset, small disjuncts, the overlap between classes, and the data fracture between the training and test distributions. Understanding these characteristics can point to new paths for improving current models.
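As a rough illustration of the two approaches named above, the following minimal Python sketch shows random oversampling (one simple form of sampling) and a cost-based decision threshold (one simple form of cost-sensitive learning). It is not taken from the paper. The function names, the toy data, and the cost values are all hypothetical, and the threshold formula assumes that correct predictions cost nothing.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Ratio of the largest class size to the smallest class size."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def random_oversample(examples, labels, seed=0):
    """Duplicate randomly chosen examples of each smaller class until
    every class has as many examples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(examples), list(labels)
    for cls, n in counts.items():
        pool = [x for x, y in zip(examples, labels) if y == cls]
        for _ in range(target - n):
            out_x.append(rng.choice(pool))
            out_y.append(cls)
    return out_x, out_y


def cost_sensitive_threshold(cost_fp, cost_fn):
    """Probability threshold above which predicting the positive
    (minority) class minimizes expected cost, assuming correct
    predictions have zero cost."""
    return cost_fp / (cost_fp + cost_fn)


# Toy data (hypothetical): 8 majority examples, 2 minority examples.
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

X_bal, y_bal = random_oversample(X, y)
print(imbalance_ratio(y), imbalance_ratio(y_bal))

# If a missed minority example is 9 times as costly as a false alarm,
# the decision threshold drops well below 0.5.
print(cost_sensitive_threshold(cost_fp=1, cost_fn=9))
```

Oversampling changes the training data and leaves the learner untouched, while the threshold changes the decision rule and leaves the data untouched. Neither addresses the intrinsic characteristics listed above, such as class overlap or a shift between training and test distributions.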