When learning to classify streaming data, obtaining the true labels may require major effort and incur excessive cost. Active learning aims to learn an accurate model with as few labels as possible. Streaming data poses additional challenges for active learning, since the data distribution may change over time (concept drift) and classifiers need to adapt. Conventional active learning strategies focus on querying the most uncertain instances, which typically lie close to the decision boundary. Changes that occur far from the boundary will therefore be missed, and classifiers will fail to adapt. In this paper we develop two active learning strategies for streaming data that explicitly handle concept drift. They are based on uncertainty, dynamic allocation of labeling effort over time, and randomization of the search space. We empirically demonstrate that these strategies react well to changes that occur unexpectedly and anywhere in the instance space.
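As an illustration of how uncertainty, a labeling budget spread over time, and randomization might fit together, the following is a minimal Python sketch. The class name, parameter names, default values, and the multiplicative threshold update are assumptions made for this example; they are not specified in the abstract above, and the sketch should not be read as the paper's exact strategies.

```python
import random


class RandomizedUncertaintySampler:
    """Decides, for each incoming instance, whether to request its true label.

    Hypothetical sketch: all names and defaults are illustrative assumptions.
    """

    def __init__(self, budget=0.1, threshold=1.0, step=0.01, noise_std=1.0, seed=0):
        self.budget = budget        # target fraction of instances to label
        self.threshold = threshold  # current confidence threshold
        self.step = step            # relative size of each threshold adjustment
        self.noise_std = noise_std  # spread of the random threshold multiplier
        self.rng = random.Random(seed)
        self.seen = 0               # instances observed so far
        self.labeled = 0            # labels requested so far

    def should_query(self, max_posterior):
        """max_posterior: the classifier's confidence in its predicted class."""
        self.seen += 1

        # Budget: stop querying while the labeled fraction is at or above budget.
        if self.labeled / self.seen >= self.budget:
            return False

        # Randomization: perturb the threshold so that confident instances,
        # far from the decision boundary, are still queried occasionally.
        randomized = self.threshold * self.rng.gauss(1.0, self.noise_std)

        if max_posterior < randomized:
            # Uncertain enough: query, then tighten the threshold so that
            # labeling effort is spread over time rather than spent at once.
            self.labeled += 1
            self.threshold *= 1.0 - self.step
            return True

        # Confident: relax the threshold so that querying resumes eventually.
        self.threshold *= 1.0 + self.step
        return False


# Usage: simulate a stream of classifier confidences in [0.5, 1.0].
stream_rng = random.Random(42)
sampler = RandomizedUncertaintySampler(budget=0.1)
decisions = [sampler.should_query(stream_rng.uniform(0.5, 1.0)) for _ in range(1000)]
```

In this sketch the budget check runs before the uncertainty test, so the number of requested labels stays within one label of `budget * seen` at every point in the stream. The Gaussian multiplier is one possible way to randomize; any perturbation that gives confident instances a nonzero chance of being queried would serve the same purpose.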