Mining multi-label concept-drifting data streams using ensemble classifiers

Authors:
Qu Wei;Zhang Yang;Zhu Junping;Wang Yong
Affiliations:
College of Information Engineering, Northwest A&F University, Yangling, P.R. China;College of Information Engineering, Northwest A&F University, Yangling, P.R. China;College of Information Engineering, Northwest A&F University, Yangling, P.R. China;School of Computer, Northwest Polytechnical University, Xi'an, P.R. China
Venue:
FSKD'09 Proceedings of the 6th international conference on Fuzzy systems and knowledge discovery - Volume 5
Year:
2009

Abstract

The problem of mining single-label data streams has been extensively studied in recent years. However, mining multilabel data streams has received much less attention. In this paper, a weighted voting ensemble approach is proposed to tackle this problem. We partition the incoming data stream into sequential chunks and use the binary relevance method to transform each chunk into a set of single-label chunks, each of which can be learned by a binary classification algorithm. We train an ensemble of classifiers from the transformed chunks, and the classifiers in the ensemble are weighted by their expected classification accuracy on the test data in the time-evolving environment. We also propose a method for simulating multilabel data streams with concept drift. Our empirical study on a synthetic data set shows that the proposed approach has a substantial advantage over the majority voting ensemble approach.
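
The pipeline described in the abstract (chunking, binary relevance, accuracy-weighted voting) can be illustrated with a minimal Python sketch. It is not the authors' implementation. All names (binary_relevance, CentroidClassifier, train_member, member_weights, predict) are hypothetical, the nearest-centroid learner is only a stand-in for an arbitrary binary classifier, and using accuracy on the most recent labelled chunk as a proxy for "expected classification accuracy on the test data" is an assumption; the abstract does not specify how that expectation is estimated.

```python
def binary_relevance(chunk, n_labels):
    """Split one multilabel chunk into n_labels binary data sets.

    chunk is a list of (features, label_set) pairs.
    """
    return [[(x, int(j in labels)) for x, labels in chunk]
            for j in range(n_labels)]


class CentroidClassifier:
    """Toy binary learner (nearest class centroid); a stand-in only."""

    def fit(self, data):
        self.centroids = {}
        for cls in (0, 1):
            rows = [x for x, y in data if y == cls]
            if rows:
                self.centroids[cls] = [sum(col) / len(rows)
                                       for col in zip(*rows)]
        return self

    def predict(self, x):
        def dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda cls: dist(self.centroids[cls]))


def train_member(chunk, n_labels):
    """One ensemble member = one binary classifier per label."""
    return [CentroidClassifier().fit(d)
            for d in binary_relevance(chunk, n_labels)]


def member_weights(member, recent_chunk, n_labels):
    """Assumed weighting: per-label accuracy on the most recent chunk."""
    per_label = binary_relevance(recent_chunk, n_labels)
    return [sum(clf.predict(x) == y for x, y in data) / len(data)
            for clf, data in zip(member, per_label)]


def predict(ensemble, weights, x, n_labels):
    """Weighted vote per label; a label is kept if it wins over half the weight."""
    labels = set()
    for j in range(n_labels):
        total = sum(w[j] for w in weights)
        vote_for = sum(w[j] for m, w in zip(ensemble, weights)
                       if m[j].predict(x) == 1)
        if total > 0 and vote_for > total / 2:
            labels.add(j)
    return labels


# Two chunks with an abrupt drift: the label meanings swap between them.
old_chunk = [((0.9, 0.1), {0}), ((0.8, 0.2), {0}),
             ((0.1, 0.9), {1}), ((0.2, 0.8), {1})]
new_chunk = [((0.9, 0.1), {1}), ((0.8, 0.2), {1}),
             ((0.1, 0.9), {0}), ((0.2, 0.8), {0})]
recent_chunk = [((0.7, 0.3), {1}), ((0.3, 0.7), {0})]  # follows the new concept

n_labels = 2
ensemble = [train_member(old_chunk, n_labels), train_member(new_chunk, n_labels)]
weights = [member_weights(m, recent_chunk, n_labels) for m in ensemble]

weighted = predict(ensemble, weights, (0.95, 0.05), n_labels)
# Equal weights correspond to plain majority voting.
majority = predict(ensemble, [[1.0, 1.0], [1.0, 1.0]], (0.95, 0.05), n_labels)
```

In this toy stream the member trained before the drift scores zero accuracy on the recent chunk and so contributes nothing to the vote, while equal weights leave the two members tied on every label. The sketch only illustrates why accuracy-based weights can help under drift; it says nothing about the magnitude of the advantage reported in the paper.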