Classification and Novel Class Detection in Concept-Drifting Data Streams under Time Constraints

Authors:
Mohammad Masud;Jing Gao;Latifur Khan;Jiawei Han;Bhavani M. Thuraisingham
Affiliations:
University of Texas at Dallas, Richardson;University of Illinois at Urbana-Champaign, Urbana;University of Texas at Dallas, Richardson;University of Illinois at Urbana-Champaign, Urbana;The University of Texas at Dallas, Richardson
Venue:
IEEE Transactions on Knowledge and Data Engineering
Year:
2011

Cited by 10

Modified blame-based noise reduction for concept drift
AIKED'12 Proceedings of the 11th WSEAS international conference on Artificial Intelligence, Knowledge Engineering and Data Bases

Data stream classification with artificial endocrine system
Applied Intelligence

Automated Anomaly Detector Adaptation using Adaptive Threshold Tuning
ACM Transactions on Information and System Security (TISSEC)

Novelty detection algorithm for data streams multi-class problems
Proceedings of the 28th Annual ACM Symposium on Applied Computing

An adaptive ensemble classifier for mining concept drifting data streams
Expert Systems with Applications: An International Journal

A survey on concept drift adaptation
ACM Computing Surveys (CSUR)

Novel class detection within classification for data streams
ISNN'13 Proceedings of the 10th international conference on Advances in Neural Networks - Volume Part II

Ensemble of online neural networks for non-stationary and imbalanced data streams
Neurocomputing

Dynamic supervised classification method for online monitoring in non-stationary environments
Neurocomputing

Design and Implementation of a Data Mining System for Malware Detection
Journal of Integrated Design & Process Science

Abstract

Most existing data stream classification techniques ignore one important aspect of stream data: the arrival of a novel class. We address this issue and propose a data stream classification technique that integrates a novel class detection mechanism into traditional classifiers, enabling automatic detection of novel classes before the true labels of the novel class instances arrive. The novel class detection problem becomes more challenging in the presence of concept drift, when the underlying data distributions evolve in streams. In order to determine whether an instance belongs to a novel class, the classification model sometimes needs to wait for more test instances to discover similarities among those instances. A maximum allowable wait time T_c is imposed as a time constraint to classify a test instance. Furthermore, most existing stream classification approaches assume that the true label of a data point can be accessed immediately after the data point is classified. In reality, a time delay T_l is involved in obtaining the true label of a data point, since manual labeling is time-consuming. We show how to make fast and correct classification decisions under these constraints and apply them to real benchmark data. Comparison with state-of-the-art stream classification techniques proves the superiority of our approach.
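
The role of the wait-time constraint T_c can be illustrated with a minimal Python sketch. This is a toy model, not the authors' algorithm: the class `DeferredClassifier`, the callbacks `classify` and `is_outlier`, the `min_group` threshold, and the `NOVEL` label are all hypothetical names introduced here, and time is assumed to be an abstract integer tick. Instances that look like outliers are held back; they are declared novel if enough of them accumulate, and otherwise they are classified by the existing model once they have waited T_c ticks. The labeling delay T_l is not modeled.

```python
from collections import deque

NOVEL = "novel"  # hypothetical label for instances declared a novel class


class DeferredClassifier:
    """Toy illustration of a maximum wait time T_c (assumption, not the paper's method)."""

    def __init__(self, t_c, min_group, classify, is_outlier):
        self.t_c = t_c                # maximum ticks an instance may wait
        self.min_group = min_group    # assumed: pending outliers needed to call them novel
        self.classify = classify      # assumed: existing-class classifier, x -> label
        self.is_outlier = is_outlier  # assumed: x -> True if x fits no known class
        self.pending = deque()        # (arrival_time, instance), oldest first

    def receive(self, now, x):
        """Process instance x arriving at time `now`; return the decisions made."""
        decisions = self._expire(now)
        if not self.is_outlier(x):
            # Ordinary instance: classify immediately.
            decisions.append((x, self.classify(x)))
            return decisions
        # Outlier: defer the decision in the hope that similar instances arrive.
        self.pending.append((now, x))
        if len(self.pending) >= self.min_group:
            # Crude stand-in for "discovering similarities among instances".
            decisions.extend((p, NOVEL) for _, p in self.pending)
            self.pending.clear()
        return decisions

    def _expire(self, now):
        """Force a decision on every instance that has waited at least T_c."""
        decisions = []
        while self.pending and now - self.pending[0][0] >= self.t_c:
            _, x = self.pending.popleft()
            decisions.append((x, self.classify(x)))
        return decisions


if __name__ == "__main__":
    clf = DeferredClassifier(
        t_c=3,
        min_group=2,
        classify=lambda x: "A" if x < 5 else "B",
        is_outlier=lambda x: x > 100,
    )
    for t, x in [(0, 1), (1, 500), (2, 501), (3, 700), (6, 2)]:
        print(t, clf.receive(t, x))
```

In this sketch a larger T_c gives outliers more time to gather company before a decision is forced, while a smaller T_c yields faster but less informed decisions, which is the trade-off the abstract describes.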