One-Class Classification of Text Streams with Concept Drift

Authors:
Yang Zhang; Xue Li; Maria Orlowska
Venue:
ICDMW '08 Proceedings of the 2008 IEEE International Conference on Data Mining Workshops
Year:
2008

Citing 0
Cited 4

OcVFDT: one-class very fast decision tree for one-class classification of data streams
Proceedings of the Third International Workshop on Knowledge Discovery from Sensor Data

Editorial: Classifying text streams by keywords using classifier ensemble
Data & Knowledge Engineering

Bayesian classifiers for positive unlabeled learning
WAIM'11 Proceedings of the 12th international conference on Web-age information management

Learning from data streams with only positive and unlabeled data
Journal of Intelligent Information Systems

Abstract

Research on streaming data classification has mostly assumed that the data can be fully labelled. However, this assumption is impractical. Firstly, it is impossible to complete the labelling before all the data has arrived. Secondly, obtaining fully labelled data through manual effort is generally very expensive. Thirdly, user interests may change over time, so labels issued earlier may be inconsistent with labels issued later; this is concept drift. In this paper, we consider the problem of one-class classification of text streams under concept drift, where a large volume of documents arrives at high speed while user interests and the data distribution change. In this setting, only a small number of positively labelled documents are available for training. We propose a stacking-style ensemble-based approach and compare it with window-based approaches, such as the single window, fixed window, and full memory approaches. Our experimental results demonstrate that the proposed ensemble approach outperforms all of the compared approaches.
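
The abstract names three window-based baselines but does not define them. The sketch below illustrates one common reading from the concept-drift literature, assumed here and not taken from the paper: the stream arrives in batches, and each baseline differs only in which past batches it keeps as training data. The function names and the `size` parameter are hypothetical, and the paper's stacking-style ensemble is not shown because its details are not given in this record.

```python
from typing import List, Sequence


def single_window(batches: Sequence[List[str]]) -> List[str]:
    """Assumed definition: train on the most recent batch only."""
    return list(batches[-1]) if batches else []


def fixed_window(batches: Sequence[List[str]], size: int) -> List[str]:
    """Assumed definition: train on the last `size` batches."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [doc for batch in batches[-size:] for doc in batch]


def full_memory(batches: Sequence[List[str]]) -> List[str]:
    """Assumed definition: train on every batch seen so far."""
    return [doc for batch in batches for doc in batch]


# Toy stream of three batches of positively labelled documents.
stream = [["doc1"], ["doc2", "doc3"], ["doc4"]]
recent_only = single_window(stream)
last_two = fixed_window(stream, size=2)
everything = full_memory(stream)
```

Under these assumptions, a single window adapts fastest to drift but sees the fewest positive examples, while full memory sees the most examples but retains labels that may reflect outdated user interests.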