Ensemble learning via negative correlation. Neural Networks.
Editorial: special issue on learning from imbalanced data sets. ACM SIGKDD Explorations Newsletter (special issue on learning from imbalanced datasets).
A study of the behavior of several methods for balancing machine learning training data. ACM SIGKDD Explorations Newsletter (special issue on learning from imbalanced datasets).
Learning from imbalanced data sets with boosting and data generation: the DataBoost-IM approach. ACM SIGKDD Explorations Newsletter (special issue on learning from imbalanced datasets).
Training cost-sensitive neural networks with methods addressing the class imbalance problem. IEEE Transactions on Knowledge and Data Engineering.
Managing diversity in regression ensembles. Journal of Machine Learning Research.
Classifying imbalanced data using a bagging ensemble variation (BEV). ACM-SE 45: Proceedings of the 45th Annual Southeast Regional Conference.
Experimental perspectives on learning from imbalanced data. Proceedings of the 24th International Conference on Machine Learning.
The class imbalance problem: a systematic study. Intelligent Data Analysis.
SMOTE: synthetic minority over-sampling technique. Journal of Artificial Intelligence Research.
A novelty detection approach to classification. IJCAI'95: Proceedings of the 14th International Joint Conference on Artificial Intelligence, Volume 1.
AI'11: Proceedings of the 24th International Conference on Advances in Artificial Intelligence.
Class imbalance learning is an important research area in machine learning, concerned with data sets in which instances of some classes heavily outnumber instances of the others. This skewed class distribution degrades classification performance. Several ensemble methods have been proposed for the class imbalance problem. Diversity, the degree to which the classifiers in an ensemble make different decisions, has been shown to be an influential factor in ensemble learning; however, none of the proposed solutions examines its impact on imbalanced data sets. Moreover, most of them rely on resampling techniques to rebalance the class distribution, and oversampling often causes overfitting (high generalisation error). This paper investigates whether diversity can alleviate these problems by using the negative correlation learning (NCL) model, which encourages diversity explicitly by adding a penalty term to the error function of each neural network in the ensemble. A variant of NCL, NCLCost, is also proposed. Our study shows that diversity has a direct impact on recall, and that it is also a factor in the reduction of F-measure. In addition, although NCL-based models with extreme settings do not achieve better minority-class recall than SMOTEBoost [1], they yield slightly better F-measure and G-mean than both independent ANNs and SMOTEBoost, and better recall than independent ANNs.
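For orientation, the penalty-based error that NCL minimises can be written in the standard form from "Ensemble learning via negative correlation" (Liu and Yao, cited above); the exact modification used by NCLCost is not specified in this abstract. A minimal sketch in LaTeX, assuming an ensemble of M networks with outputs F_i(n) on N training patterns, target d(n), and penalty strength \lambda \in [0, 1]:

E_i = \frac{1}{N} \sum_{n=1}^{N} \left[ \tfrac{1}{2} \bigl( F_i(n) - d(n) \bigr)^2 + \lambda \, p_i(n) \right]

p_i(n) = \bigl( F_i(n) - \bar{F}(n) \bigr) \sum_{j \ne i} \bigl( F_j(n) - \bar{F}(n) \bigr) = - \bigl( F_i(n) - \bar{F}(n) \bigr)^2

\bar{F}(n) = \frac{1}{M} \sum_{i=1}^{M} F_i(n)

Setting \lambda = 0 removes the penalty and reduces NCL to training M independent networks, which is why independent ANNs serve as a natural baseline; larger \lambda values (the "extreme settings" mentioned above) push the individual networks' errors to be negatively correlated, explicitly increasing ensemble diversity.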