To better handle concept change and noise: a cellular automata approach to data stream classification

Authors:
Sattar Hashemi; Ying Yang; Majid Pourkashani; Mohammadreza Kangavari
Affiliations:
Clayton School of Information Technology, Monash University, Australia; Clayton School of Information Technology, Monash University, Australia; Computer Engineering Department, Iran University of Science and Technology, Tehran, Iran; Computer Engineering Department, Iran University of Science and Technology, Tehran, Iran
Venue:
AI'07 Proceedings of the 20th Australian joint conference on Advances in artificial intelligence
Year:
2007

Abstract

A key challenge in data stream classification is to detect changes in the concept underlying the data and to adapt classifiers to each concept change accurately and efficiently. Most existing methods for handling concept change take a windowing approach, in which only recent instances are used to update classifiers while old instances are discarded indiscriminately. However, this approach is often undesirably aggressive: many old instances may be unaffected by the concept change and can still contribute to training the classifier, for instance by reducing the variance component of classification error that results from insufficient training data. Accordingly, this paper proposes a cellular automata (CA) approach that feeds classifiers with the most relevant instances rather than the most recent ones. The strength of CA is that it breaks a complicated process down into smaller adaptation tasks, each of which is the responsibility of a single automaton. Using the neighborhood rules embedded in each automaton and the emerging time of instances, the approach assigns a relevance weight to each instance. Instances with sufficiently high weights are selected to update classifiers. Theoretical analyses and experimental results suggest that a good choice of local rules for CA can considerably speed up the updating of classifiers in response to concept changes and increase classifiers' robustness to noise, and thus offer faster and better classification of data streams.
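
The abstract does not spell out the local rules, so the following Python sketch is only an illustration of the general idea of relevance weighting, not the authors' algorithm. It assumes a hypothetical 2-D grid over features scaled to [0, 1), one stored instance per cell, a Moore (8-cell) neighbourhood, and an exponential recency decay with a made-up `half_life` parameter. An instance's weight is the recency-weighted share of its neighbours that carry the same label, so an old instance keeps a high weight as long as nothing recent contradicts it.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Cell = Tuple[int, int]


@dataclass(frozen=True)
class Instance:
    x: Tuple[float, float]  # two features, assumed scaled to [0, 1)
    label: int
    t: int                  # arrival ("emerging") time of the instance


class RelevanceGrid:
    """Hypothetical grid of cells; each cell keeps the latest instance it saw."""

    def __init__(self, size: int = 10, half_life: float = 50.0) -> None:
        self.size = size
        self.half_life = half_life  # assumed decay parameter, not from the paper
        self.cells: Dict[Cell, Instance] = {}

    def _cell(self, x: Tuple[float, float]) -> Cell:
        # Map each feature to a grid index, clamped to the grid bounds.
        r, c = (min(self.size - 1, max(0, int(v * self.size))) for v in x)
        return (r, c)

    def add(self, inst: Instance) -> None:
        self.cells[self._cell(inst.x)] = inst

    def _recency(self, inst: Instance, now: int) -> float:
        # Halves every `half_life` time steps.
        return 0.5 ** ((now - inst.t) / self.half_life)

    def weight(self, cell: Cell, now: int) -> float:
        """Recency-weighted fraction of occupied neighbours sharing the label."""
        inst = self.cells[cell]
        r, c = cell
        agree = total = 0.0
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                other = self.cells.get((r + dr, c + dc))
                if (dr, dc) == (0, 0) or other is None:
                    continue
                w = self._recency(other, now)
                total += w
                if other.label == inst.label:
                    agree += w
        if total == 0.0:
            # No usable neighbour evidence: fall back to the instance's own recency.
            return self._recency(inst, now)
        return agree / total

    def select(self, now: int, threshold: float = 0.5) -> List[Instance]:
        """Instances whose relevance weight reaches the threshold."""
        return [i for cell, i in self.cells.items()
                if self.weight(cell, now) >= threshold]


# Usage: one region is untouched by the change, one is overwritten by new labels.
grid = RelevanceGrid(size=10, half_life=50.0)
old_stable = Instance((0.05, 0.05), label=0, t=0)
old_drifted = Instance((0.55, 0.55), label=0, t=0)
for inst in (old_stable,
             Instance((0.15, 0.05), 0, 0), Instance((0.05, 0.15), 0, 0),
             old_drifted, Instance((0.45, 0.55), 0, 0),
             Instance((0.65, 0.55), 1, 100), Instance((0.55, 0.65), 1, 100)):
    grid.add(inst)

selected = grid.select(now=100, threshold=0.5)
```

In this toy stream, `old_stable` is as old as `old_drifted`, yet only the latter is dropped, because its recent neighbours disagree with it. A fixed-size window would have discarded both, which is the contrast the abstract draws between "most relevant" and "most recent".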