Lacking Labels in the Stream: Classifying Evolving Stream Data with Few Labels

Authors:
Clay Woolam;Mohammad M. Masud;Latifur Khan
Affiliations:
Department of Computer Science, University of Texas at Dallas,;Department of Computer Science, University of Texas at Dallas,;Department of Computer Science, University of Texas at Dallas,
Venue:
ISMIS '09 Proceedings of the 18th International Symposium on Foundations of Intelligent Systems
Year:
2009

Citing 10
Cited 2

Mining high-speed data streams

Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Mining time-changing data streams

Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Mining concept-drifting data streams using ensemble classifiers

Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Systematic data selection to mine concept-drifting data streams

Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Combining proactive and reactive predictions for data streams

Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
Using additive expert ensembles to cope with concept drift

ICML '05 Proceedings of the 22nd international conference on Machine learning
A Framework for On-Demand Classification of Evolving Data Streams

IEEE Transactions on Knowledge and Data Engineering
On Appropriate Assumptions to Mine Data Streams: Analysis and Practice

ICDM '07 Proceedings of the 2007 Seventh IEEE International Conference on Data Mining
A Practical Approach to Classify Evolving Data Streams: Training with Limited Amount of Labeled Data

ICDM '08 Proceedings of the 2008 Eighth IEEE International Conference on Data Mining
Stop Chasing Trends: Discovering High Order Models in Evolving Data

ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering

Active learning with evolving streaming data

ECML PKDD'11 Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part III
Semi-supervised ensemble learning of data streams in the presence of concept drift

HAIS'12 Proceedings of the 7th international conference on Hybrid Artificial Intelligent Systems - Volume Part II

Quantified Score

Hi-index	0.00

Visualization

Abstract

This paper outlines a data stream classification technique that addresses the problem of insufficient and biased labeled data. It is practical to assume that only a small fraction of instances in the stream are labeled. A more practical assumption would be that the labeled data may not be independently distributed among all training documents. How can we ensure that a good classification model would be built in these scenarios, considering that the data stream also has evolving nature? In our previous work we applied semi-supervised clustering to build classification models using limited amount of labeled training data. However, it assumed that the data to be labeled should be chosen randomly. In our current work, we relax this assumption, and propose a label propagation framework for data streams that can build good classification models even if the data are not labeled randomly. Comparison with state-of-the-art stream classification techniques on synthetic and benchmark real data proves the effectiveness of our approach.