Batch-incremental versus instance-incremental learning in dynamic and evolving data

Authors:
Jesse Read;Albert Bifet;Bernhard Pfahringer;Geoff Holmes
Affiliations:
Universidad Carlos III, Madrid, Spain;University of Waikato, Hamilton, New Zealand;University of Waikato, Hamilton, New Zealand;University of Waikato, Hamilton, New Zealand
Venue:
IDA'12 Proceedings of the 11th international conference on Advances in Intelligent Data Analysis
Year:
2012

Abstract

Many real-world problems involve the challenging context of data streams, where classifiers must be incremental: able to learn from a theoretically infinite stream of examples using limited time and memory, while being able to predict at any point. Two approaches dominate the literature: batch-incremental methods, which gather examples in batches to train models, and instance-incremental methods, which learn from each example as it arrives. Papers in the literature typically choose one of these approaches but provide insufficient evidence or references to justify the choice. We provide a first in-depth analysis comparing both approaches, including how they adapt to concept drift, and an extensive empirical study comparing several versions of each approach. Our results reveal the respective advantages and disadvantages of the methods, which we discuss in detail.
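
The distinction between the two approaches can be illustrated with a minimal sketch. This is not the authors' implementation or any method evaluated in the paper: the toy nearest-centroid base learner, the class names (CentroidModel, BatchIncremental), the parameters batch_size and max_models, and the test-then-train helper are all assumptions chosen only to make the contrast concrete. The instance-incremental learner updates its model on every example; the batch-incremental learner buffers examples, trains a fresh model per full batch, and keeps only the most recent models, which is one simple way such a scheme might forget outdated concepts.

```python
from collections import Counter, defaultdict, deque


class CentroidModel:
    """Toy instance-incremental learner: nearest class centroid on a 1-D feature.

    Hypothetical base learner, used here only for illustration.
    """

    def __init__(self):
        self.sums = defaultdict(float)  # running sum of x per class
        self.counts = Counter()         # number of examples seen per class

    def learn_one(self, x, y):
        # Update the model immediately with a single example.
        self.sums[y] += x
        self.counts[y] += 1

    def predict(self, x):
        if not self.counts:
            return None  # nothing learned yet
        return min(self.counts,
                   key=lambda c: abs(x - self.sums[c] / self.counts[c]))


class BatchIncremental:
    """Toy batch-incremental learner: one fresh model per full batch.

    Only the last `max_models` models are kept (assumed forgetting scheme);
    predictions are made by majority vote among them.
    """

    def __init__(self, batch_size=4, max_models=2):
        self.batch_size = batch_size
        self.buffer = []
        self.models = deque(maxlen=max_models)  # oldest model drops out

    def learn_one(self, x, y):
        # Examples are only buffered until a full batch is available.
        self.buffer.append((x, y))
        if len(self.buffer) == self.batch_size:
            model = CentroidModel()
            for bx, by in self.buffer:
                model.learn_one(bx, by)
            self.models.append(model)
            self.buffer = []

    def predict(self, x):
        votes = Counter(m.predict(x) for m in self.models)
        return votes.most_common(1)[0][0] if votes else None


def prequential_accuracy(learner, stream):
    """Test-then-train evaluation: predict each example before learning it."""
    correct = 0
    total = 0
    for x, y in stream:
        if learner.predict(x) == y:
            correct += 1
        total += 1
        learner.learn_one(x, y)
    return correct / total if total else 0.0
```

In this sketch the batch-incremental learner cannot predict until its first batch is complete, whereas the instance-incremental learner can predict after a single example; the trade-off is that the batch learner discards old models wholesale, while the instance-incremental model shown here has no forgetting mechanism at all.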