We have recently introduced an incremental learning algorithm, called Learn++.NSE, designed for Non-Stationary Environments (concept drift), where the underlying data distribution changes over time. With each dataset drawn from a new environment, Learn++.NSE generates a new classifier to form an ensemble of classifiers. The ensemble members are combined through dynamically weighted majority voting, where voting weights are determined by each classifier's age-adjusted accuracy on current and past environments. Unlike other ensemble-based concept drift algorithms, Learn++.NSE does not discard prior classifiers, allowing potentially cyclical environments to be learned more effectively. While Learn++.NSE has been shown to work well on a variety of concept drift problems, a potential shortcoming of this approach is the cumulative growth of the ensemble. In this contribution, we expand our analysis of the algorithm to include various ensemble pruning methods that introduce controlled forgetting. Error-based and age-based pruning methods have been integrated into the algorithm to prevent potential outvoting by irrelevant classifiers, or simply to limit memory use over extended periods of time. Here, we analyze the tradeoff between these precautions and the desire to handle recurring contexts (cyclical data). Comparisons are made using several scenarios that introduce various types of drift.
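To make the two mechanisms discussed above concrete, the following is a minimal sketch of dynamically weighted majority voting with error-based or age-based pruning. It is an illustrative simplification, not the actual Learn++.NSE implementation: the `WeightedEnsemble` and `Stump` classes, the `max_size` cap, and the use of a classifier's voting weight as a stand-in for its age-adjusted accuracy are all assumptions made for this example.

```python
import numpy as np

class WeightedEnsemble:
    """Hypothetical sketch: weighted majority voting with pruning.

    Not the Learn++.NSE algorithm itself; it only illustrates the
    voting/pruning tradeoff described in the abstract.
    """

    def __init__(self, max_size=None, prune_by="age"):
        self.classifiers = []     # fitted base classifiers, oldest first
        self.weights = []         # voting weights (stand-in for accuracy)
        self.max_size = max_size  # None = cumulative ensemble (no forgetting)
        self.prune_by = prune_by  # "age" drops the oldest, "error" the worst

    def add(self, clf, weight):
        self.classifiers.append(clf)
        self.weights.append(weight)
        if self.max_size is not None and len(self.classifiers) > self.max_size:
            if self.prune_by == "age":
                idx = 0  # oldest classifier
            else:
                idx = int(np.argmin(self.weights))  # lowest-weight member
            del self.classifiers[idx]
            del self.weights[idx]

    def predict(self, X):
        # Weighted majority vote over binary labels {0, 1}:
        # predict 1 where the weighted vote mass for class 1 is at least half.
        votes = np.zeros(len(X))
        for clf, w in zip(self.classifiers, self.weights):
            votes += w * clf.predict(X)
        return (votes >= 0.5 * sum(self.weights)).astype(int)


class Stump:
    """Toy base classifier: thresholds the first feature."""
    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, X):
        return (X[:, 0] > self.threshold).astype(int)


ens = WeightedEnsemble(max_size=2, prune_by="error")
ens.add(Stump(0.2), weight=0.9)
ens.add(Stump(0.5), weight=0.4)
ens.add(Stump(0.8), weight=0.7)  # exceeds max_size: prunes the 0.4-weight stump
print(ens.predict(np.array([[0.6], [0.1]])))  # → [1 0]
```

Note the tradeoff the abstract analyzes: with `max_size=None` the ensemble is cumulative and old classifiers remain available for recurring contexts, while a finite `max_size` bounds memory and curbs outvoting at the cost of forgetting.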