Concept-Learning in the Presence of Between-Class and Within-Class Imbalances

Authors: Nathalie Japkowicz
Affiliations: -
Venue: AI '01 Proceedings of the 14th Biennial Conference of the Canadian Society on Computational Studies of Intelligence: Advances in Artificial Intelligence
Year: 2001

Abstract

In a concept-learning problem, imbalances in the distribution of the data can occur either between the two classes or within a single class. Although both types of imbalance are known to degrade the performance of standard classifiers, methods for dealing with the class imbalance problem usually focus on rectifying the between-class imbalance and neglect the imbalance occurring within each class. The purpose of this paper is to extend the simplest proposed approach to the between-class imbalance problem, random re-sampling, so that it deals with both problems simultaneously. Although re-sampling is not necessarily the best way to deal with problems of imbalance, the results reported in this paper suggest that addressing both problems simultaneously is beneficial and that more sophisticated techniques should do so as well.
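
The abstract does not spell out the re-sampling procedure, so the following is only a sketch of one way random oversampling could address both imbalances at once. It assumes that each example already carries a subcluster ID within its class (for instance from a prior clustering step). The function name `resample_within_and_between`, the `(features, class_label, cluster_id)` tuple layout, and the choice of target size are illustrative assumptions and are not taken from the paper.

```python
import random
from collections import defaultdict


def resample_within_and_between(samples, seed=0):
    """Randomly oversample so that classes and their subclusters are balanced.

    samples: list of (features, class_label, cluster_id) tuples (assumed layout).
    Returns a new list in which every class has the same number of examples
    and the subclusters inside each class have equal size (up to rounding).
    """
    if not samples:
        return []
    rng = random.Random(seed)

    # Group examples by class, then by subcluster within the class.
    groups = defaultdict(lambda: defaultdict(list))
    for sample in samples:
        _, label, cluster = sample
        groups[label][cluster].append(sample)

    # Common per-class total, chosen large enough that no subcluster
    # would have to shrink (this sketch oversamples only).
    class_total = max(
        len(clusters) * max(len(members) for members in clusters.values())
        for clusters in groups.values()
    )

    balanced = []
    for label in sorted(groups):
        clusters = groups[label]
        # Split the class total evenly over its subclusters.
        base, extra = divmod(class_total, len(clusters))
        for i, cluster in enumerate(sorted(clusters)):
            members = clusters[cluster]
            target = base + (1 if i < extra else 0)
            # Keep every original example, then draw duplicates at random
            # (with replacement) until the subcluster reaches its target.
            balanced.extend(members)
            balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced
```

Plain between-class oversampling would duplicate minority-class examples uniformly, which leaves rare subclusters rare. Setting the target per subcluster is what lets this sketch treat the within-class imbalance in the same pass.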