Soft-CsGDT: soft cost-sensitive Gaussian decision tree for cost-sensitive classification of data streams

  • Authors: Ning Guo; Yanhua Yu; Meina Song; Junde Song; Yu Fu
  • Affiliations: Beijing University of Posts and Telecommunications, Beijing, China (all authors)
  • Venue: Proceedings of the 2nd International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications
  • Year: 2013

Abstract

In many real-world scenarios, high-speed data streams carry non-uniform misclassification costs and therefore call for cost-sensitive algorithms for data stream classification. However, little literature addresses this issue. Moreover, existing cost-sensitive classification algorithms can achieve excellent performance in terms of total misclassification cost, but generally at the expense of a marked reduction in accuracy, which greatly restricts their practical application. In this paper, we present an improved folk theorem. Based on the new theorem, an existing accuracy-based classification algorithm can be converted directly into a soft cost-sensitive one, which allows both accuracy and cost to be taken into account. Following the idea of this theorem, we propose the soft-CsGDT algorithm, an extension of GDT, to process data streams with non-uniform misclassification costs. Experimental results on both synthetic and real-world datasets show that, compared with the cost-sensitive algorithm, soft-CsGDT significantly improves accuracy while keeping the total misclassification cost approximately the same.
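
The trade-off between accuracy and total misclassification cost described in the abstract can be illustrated with a minimal sketch. This is not the soft-CsGDT algorithm or the improved folk theorem, neither of which is specified in this record. It only contrasts a standard accuracy-oriented decision rule (pick the most probable class) with a standard cost-sensitive rule (pick the class with the lowest expected cost). The cost matrix, labels, probabilities, and all function names are hypothetical values chosen for illustration.

```python
# Hypothetical cost matrix COST[true][predicted]; not taken from the paper.
COST = [
    [0.0, 1.0],  # true class 0: a false positive costs 1
    [5.0, 0.0],  # true class 1: a false negative costs 5
]

def predict_max_prob(probs):
    """Accuracy-oriented rule: pick the most probable class."""
    return max(range(len(probs)), key=lambda k: probs[k])

def predict_min_cost(probs, cost):
    """Cost-sensitive rule: pick the class with the lowest expected cost."""
    n = len(probs)
    expected = [sum(probs[t] * cost[t][p] for t in range(n)) for p in range(n)]
    return min(range(n), key=lambda p: expected[p])

def accuracy(y_true, y_pred):
    """Fraction of examples classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def total_cost(y_true, y_pred, cost):
    """Sum of the misclassification costs over all examples."""
    return sum(cost[t][p] for t, p in zip(y_true, y_pred))

# Made-up labels and class-probability estimates for five examples.
y_true = [0, 0, 0, 1, 1]
probs = [[0.9, 0.1], [0.7, 0.3], [0.6, 0.4], [0.7, 0.3], [0.2, 0.8]]

acc_pred = [predict_max_prob(p) for p in probs]
cs_pred = [predict_min_cost(p, COST) for p in probs]

print(accuracy(y_true, acc_pred), total_cost(y_true, acc_pred, COST))  # 0.8 5.0
print(accuracy(y_true, cs_pred), total_cost(y_true, cs_pred, COST))    # 0.6 2.0
```

On this toy data the cost-sensitive rule lowers the total cost from 5.0 to 2.0 while accuracy falls from 0.8 to 0.6, which is the kind of accuracy loss that a soft cost-sensitive approach, as described in the abstract, aims to reduce.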