Classifier Ensemble for Imbalanced Data Stream Classification
Proceedings of the CUBE International Information Technology Conference
Class distributions of data streams in real applications are usually imbalanced; such streams are therefore called Skewed Data Streams (SDS). Classifying SDS is a challenge for traditional methods because minority classes are difficult to recognize. Many approaches have been proposed to improve the recognition rate of minority classes, but they are time-consuming. Motivated by this, we propose an efficient Ensemble method for Classifying SDS, called ECSDS. Our algorithm builds multiple classifiers based on C4.5 and uses a threshold on the F1 score to limit how often the classifiers are updated. When an update is triggered, it adds misclassified positive instances to the training data to keep the classifiers effective. Experimental studies demonstrate that the proposed method reduces time overhead while maintaining good classification accuracy.
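The update rule described in the abstract could be sketched roughly as follows. This is a hypothetical illustration, not the authors' implementation: scikit-learn's `DecisionTreeClassifier` stands in for C4.5, and the class name `ECSDSSketch`, the ensemble size, the F1 threshold, and the majority-vote combination are all assumptions made for the sake of a runnable example.

```python
# Hypothetical sketch of the ECSDS-style update loop: an ensemble of decision
# trees (a stand-in for C4.5) is retrained on a new chunk only when its F1
# score on that chunk falls below a threshold, and misclassified positive
# (minority-class) instances are folded back into the training data.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import f1_score

class ECSDSSketch:
    def __init__(self, n_members=5, f1_threshold=0.6):
        self.n_members = n_members        # max ensemble size (assumed)
        self.f1_threshold = f1_threshold  # update trigger (assumed)
        self.members = []                 # classifiers trained on past chunks
        self.kept_positives = []          # misclassified positives to reuse

    def predict(self, X):
        # Simple majority vote over ensemble members (combination rule assumed).
        votes = np.array([m.predict(X) for m in self.members])
        return (votes.mean(axis=0) >= 0.5).astype(int)

    def process_chunk(self, X, y):
        if self.members:
            y_pred = self.predict(X)
            f1 = f1_score(y, y_pred, zero_division=0)
            if f1 >= self.f1_threshold:
                return  # ensemble still performs well: skip the costly update
            # Retain misclassified positive instances for the next training set.
            wrong_pos = (y == 1) & (y_pred != y)
            self.kept_positives.extend(zip(X[wrong_pos], y[wrong_pos]))
        # Train a new member on the chunk plus the retained positives.
        if self.kept_positives:
            Xp, yp = zip(*self.kept_positives)
            X_train = np.vstack([X, np.array(Xp)])
            y_train = np.concatenate([y, np.array(yp)])
        else:
            X_train, y_train = X, y
        clf = DecisionTreeClassifier(max_depth=5).fit(X_train, y_train)
        self.members.append(clf)
        if len(self.members) > self.n_members:
            self.members.pop(0)  # discard the oldest member
```

Gating retraining on an F1 threshold, rather than updating on every chunk, is what would save time on streams whose distribution is mostly stable; carrying misclassified positives forward counteracts the skew when an update does occur.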