Many real-world data sets have imbalanced class distributions, in which almost all examples belong to one class and far fewer belong to the others. The minority examples usually form the class of greater interest, such as rare diseases in diagnostic data, failures in inspection data, and fraud in credit screening data. A classifier induced from an imbalanced data set typically has high classification accuracy for the majority class but an unacceptable error rate for the minority class. This situation is called the class imbalance problem, and it has attracted considerable attention from researchers in data mining. To address this problem, this work proposes a novel method called Mahalanobis Distance based Sampling (MDS). Experimental results indicate that the proposed MDS identifies the minority class better than traditional techniques, namely under-sampling, cost-adjusting, and cluster-based sampling.
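The gap between majority-class accuracy and minority-class error can be made concrete with a small arithmetic sketch. The class counts below (990 majority, 10 minority) and the always-predict-majority classifier are hypothetical and chosen only for illustration; they are not data or methods from this work, and the sketch does not implement MDS.

```python
def accuracy(y_true, y_pred):
    """Fraction of all examples that are labelled correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred, label):
    """Fraction of the examples of class `label` that are labelled correctly."""
    predictions_for_label = [p for t, p in zip(y_true, y_pred) if t == label]
    hits = sum(p == label for p in predictions_for_label)
    return hits / len(predictions_for_label)


# Hypothetical imbalanced data set: 990 majority (0) and 10 minority (1) examples.
y_true = [0] * 990 + [1] * 10

# A degenerate classifier that always predicts the majority class.
y_pred = [0] * len(y_true)

overall_accuracy = accuracy(y_true, y_pred)
majority_recall = recall(y_true, y_pred, 0)
minority_recall = recall(y_true, y_pred, 1)

print(f"overall accuracy: {overall_accuracy:.2f}")
print(f"majority recall:  {majority_recall:.2f}")
print(f"minority recall:  {minority_recall:.2f}")
```

Under these assumptions the overall accuracy is 0.99 while every minority example is misclassified, which is why per-class measures rather than overall accuracy are the natural way to compare methods on imbalanced data.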