OcVFDT: one-class very fast decision tree for one-class classification of data streams

  • Authors: Chen Li; Yang Zhang; Xue Li

  • Affiliations: Northwest A&F University, Yangling, Shaanxi Province, P.R. China (Chen Li, Yang Zhang); The University of Queensland, Brisbane, Queensland, Australia (Xue Li)

  • Venue: Proceedings of the Third International Workshop on Knowledge Discovery from Sensor Data
  • Year: 2009

Abstract

Current research on data stream classification focuses mainly on supervised learning, which requires a fully labeled data stream for training. However, fully labeled data streams are expensive to obtain, which makes the supervised approach difficult to apply in real-life applications. In this paper, we model applications such as credit fraud detection and intrusion detection as a one-class data stream classification problem. The cost of labeling the data stream is reduced because users need to provide only some positive samples, together with unlabeled samples, to the learner. Based on VFDT and POSC4.5, we propose the OcVFDT (One-class Very Fast Decision Tree) algorithm. An experimental study on both synthetic and real-life datasets shows that OcVFDT has excellent classification performance. Even when 80% of the samples in the data stream are unlabeled, the classification performance of OcVFDT is still very close to that of VFDT trained on a fully labeled stream.
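
For readers unfamiliar with VFDT, the base learner that OcVFDT builds on, the following is a minimal sketch of the standard Hoeffding-bound split test used by VFDT. It is not the OcVFDT algorithm itself: the one-class split criterion adapted from POSC4.5 is not described in this abstract and is not reproduced here. The function names, the default `delta`, and the `tie_threshold` value are illustrative assumptions.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Hoeffding bound: epsilon = sqrt(R^2 * ln(1/delta) / (2n)).

    With probability 1 - delta, the true mean of a variable with range R
    lies within epsilon of the mean observed over n samples.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float, n: int,
                 delta: float = 1e-7, value_range: float = 1.0,
                 tie_threshold: float = 0.05) -> bool:
    """Standard VFDT test (assumed defaults, hypothetical helper).

    Split a leaf when the best attribute beats the runner-up by more than
    epsilon, or when epsilon is so small that the two are effectively tied.
    value_range = 1.0 corresponds to information gain with two classes.
    """
    eps = hoeffding_bound(value_range, delta, n)
    return (best_gain - second_gain) > eps or eps < tie_threshold


if __name__ == "__main__":
    # A clear winner after 200 samples, and a near tie after 200 samples.
    print(should_split(0.30, 0.05, n=200))
    print(should_split(0.30, 0.25, n=200))
```

Because the bound shrinks as more samples reach a leaf, a decision that is deferred early in the stream can be made later without revisiting past data, which is what makes a single pass over the stream sufficient.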