MDS: a novel method for class imbalance learning

Authors:
Long-Sheng Chen;Chun-Chin Hsu;Yu-Shan Chang
Affiliations:
Chaoyang University of Technology, Taichung County, Taiwan;Chaoyang University of Technology, Taichung County, Taiwan;Chaoyang University of Technology, Taichung County, Taiwan
Venue:
Proceedings of the 3rd International Conference on Ubiquitous Information Management and Communication
Year:
2009

Abstract

Many real-world data sets have imbalanced class distributions, in which almost all examples belong to one class and far fewer belong to the others. The minority examples are usually the class of greater interest, such as rare diseases in diagnosis data, failures in inspection data, and fraud in credit screening data. A classifier induced from an imbalanced data set typically has high classification accuracy for the majority class but an unacceptable error rate for the minority class. This situation is called the class imbalance problem and has attracted considerable attention from data mining researchers. To address it, this work proposes a novel method called Mahalanobis Distance based Sampling (MDS). Experimental results indicate that the proposed MDS identifies the minority class better than traditional techniques, namely under-sampling, cost-adjusting, and cluster-based sampling.
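
The class imbalance problem described in the abstract can be illustrated with a minimal sketch. It does not implement MDS or any technique from the paper. It only shows, on made-up labels (95 majority examples, 5 minority examples, both counts assumed for illustration), how a classifier that always predicts the majority class scores high overall accuracy while missing every minority example. The function names `accuracy` and `per_class_recall` are hypothetical helpers.

```python
def accuracy(y_true, y_pred):
    """Fraction of all examples whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """Map each true label to the fraction of its examples predicted correctly."""
    recall = {}
    for label in set(y_true):
        indices = [i for i, t in enumerate(y_true) if t == label]
        hits = sum(1 for i in indices if y_pred[i] == label)
        recall[label] = hits / len(indices)
    return recall


# Illustrative data only: label 0 is the majority class, label 1 the minority.
y_true = [0] * 95 + [1] * 5

# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 100

overall = accuracy(y_true, y_pred)
recall = per_class_recall(y_true, y_pred)

print(f"overall accuracy: {overall:.2f}")
print(f"majority recall:  {recall[0]:.2f}")
print(f"minority recall:  {recall[1]:.2f}")
```

Overall accuracy alone hides the failure on the minority class, which is why per-class measures are generally preferred when evaluating methods for imbalanced data.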