Mining multi-label concept-drifting data streams using ensemble classifiers

  • Authors:
  • Qu Wei; Zhang Yang; Zhu Junping; Wang Yong

  • Affiliations:
  • College of Information Engineering, Northwest A&F University, Yangling, P.R. China; College of Information Engineering, Northwest A&F University, Yangling, P.R. China; College of Information Engineering, Northwest A&F University, Yangling, P.R. China; School of Computer, Northwest Polytechnical University, Xi'an, P.R. China

  • Venue:
  • FSKD'09 Proceedings of the 6th international conference on Fuzzy systems and knowledge discovery - Volume 5
  • Year:
  • 2009

Abstract

The problem of mining single-label data streams has been extensively studied in recent years, but mining multi-label data streams has received far less attention. In this paper, we propose a weighted voting ensemble approach to tackle this problem. We partition the incoming data stream into sequential chunks and use the binary relevance method to transform each chunk into a set of single-label chunks, each of which can be learned by a binary classification algorithm. We train an ensemble of classifiers from the transformed chunks, and weight the classifiers in the ensemble by their expected classification accuracy on the test data under the time-evolving environment. We also propose a method for simulating multi-label data streams with concept drift. Our empirical study on a synthetic data set shows that the proposed approach has a substantial advantage over the majority voting ensemble approach.
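
The pipeline described in the abstract (chunking, binary relevance, a weighted ensemble) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: the names `CentroidClassifier`, `train_chunk_model`, and `WeightedEnsemble` are hypothetical, the toy nearest-centroid learner stands in for whichever binary classification algorithm is used, and each member's weight is approximated by its per-label accuracy on the most recent labeled chunk, since the abstract does not give the exact weighting formula.

```python
from typing import List, Sequence

Vector = Sequence[float]
LabelRow = Sequence[int]  # one 0/1 entry per label


class CentroidClassifier:
    """Toy binary learner (assumption: a stand-in for any binary classifier)."""

    def fit(self, X: Sequence[Vector], y: Sequence[int]) -> "CentroidClassifier":
        dim = len(X[0])
        self.centroids = {}
        for cls in (0, 1):
            rows = [x for x, t in zip(X, y) if t == cls]
            if rows:  # a class may be absent from a chunk
                self.centroids[cls] = [
                    sum(r[j] for r in rows) / len(rows) for j in range(dim)
                ]
        return self

    def predict(self, x: Vector) -> int:
        def sq_dist(c: Vector) -> float:
            return sum((a - b) ** 2 for a, b in zip(x, c))

        return min(self.centroids, key=lambda cls: sq_dist(self.centroids[cls]))


def binary_relevance(Y: Sequence[LabelRow]) -> List[List[int]]:
    """Turn a multi-label target matrix into one 0/1 target column per label."""
    return [[row[k] for row in Y] for k in range(len(Y[0]))]


def train_chunk_model(X, Y) -> List[CentroidClassifier]:
    """Train one binary classifier per label on a single chunk."""
    return [CentroidClassifier().fit(X, col) for col in binary_relevance(Y)]


def label_accuracies(model, X, Y) -> List[float]:
    """Per-label accuracy of one chunk model on a labeled chunk."""
    return [
        sum(clf.predict(x) == t for x, t in zip(X, col)) / len(X)
        for clf, col in zip(model, binary_relevance(Y))
    ]


class WeightedEnsemble:
    """Keeps the most recent chunk models and combines them by weighted vote."""

    def __init__(self, max_size: int = 5):
        self.max_size = max_size
        self.models: List[List[CentroidClassifier]] = []
        self.weights: List[List[float]] = []  # weights[i][k]: model i, label k

    def update(self, X, Y) -> None:
        """Add a model for the newest chunk and re-weight every member.

        Assumption: accuracy on the newest chunk approximates the expected
        accuracy on upcoming data. The newest member is scored on its own
        training chunk here, which is optimistic.
        """
        self.models.append(train_chunk_model(X, Y))
        self.models = self.models[-self.max_size:]
        self.weights = [label_accuracies(m, X, Y) for m in self.models]

    def predict(self, x: Vector) -> List[int]:
        """Weighted vote per label: +weight for a 1 vote, -weight for a 0 vote."""
        result = []
        for k in range(len(self.weights[0])):
            score = sum(
                w[k] if m[k].predict(x) == 1 else -w[k]
                for m, w in zip(self.models, self.weights)
            )
            result.append(1 if score > 0 else 0)
        return result
```

In use, one would call `update(X, Y)` once per incoming labeled chunk and `predict(x)` for each unlabeled instance. Setting every weight to 1 in this sketch would reduce it to a majority voting ensemble, the baseline the abstract compares against.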