To better handle concept change and noise: a cellular automata approach to data stream classification

  • Authors:
  • Sattar Hashemi;Ying Yang;Majid Pourkashani;Mohammadreza Kangavari

  • Affiliations:
  • Clayton School of Information Technology, Monash University, Australia;Clayton School of Information Technology, Monash University, Australia;Computer Engineering Department, Iran University Of Science and Technology, Tehran, Iran;Computer Engineering Department, Iran University Of Science and Technology, Tehran, Iran

  • Venue:
  • AI'07 Proceedings of the 20th Australian joint conference on Advances in artificial intelligence
  • Year:
  • 2007

Abstract

A key challenge in data stream classification is to detect changes in the concept underlying the data and to adapt classifiers accurately and efficiently to each change. Most existing methods for handling concept change take a windowing approach, in which only recent instances are used to update classifiers while old instances are discarded indiscriminately. However, this approach is often undesirably aggressive, because many old instances may be unaffected by the concept change and can therefore still contribute to training the classifier, for instance by reducing the classification variance error caused by insufficient training data. Accordingly, this paper proposes a cellular automata (CA) approach that feeds classifiers with the most relevant instances rather than the most recent ones. The strength of CA is that it breaks a complicated process down into smaller adaptation tasks, each of which is the responsibility of a single automaton. Using the neighborhood rules embedded in each automaton together with the emerging time of instances, the approach assigns a relevance weight to each instance. Instances with sufficiently high weights are selected to update classifiers. Theoretical analyses and experimental results suggest that a good choice of local rules for CA can considerably speed up the updating of classifiers in response to concept changes and increase classifiers' robustness to noise, and thus offer faster and better classification of data streams.
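
The relevance-weighting idea can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not specify the local rules, so everything here is an assumption made for illustration, including the grid discretization of the feature space (one automaton per cell), the Moore neighborhood, the exponential recency factor, and the names `Instance`, `cell_of`, `relevance_weights`, and `select_relevant`. Under these assumptions, an instance's weight is the recency-weighted share of neighboring instances that carry the same label, so an old instance stays relevant unless newer neighbors contradict it.

```python
import math
from dataclasses import dataclass


@dataclass
class Instance:
    x: tuple   # feature vector, each value assumed to lie in [0, 1]
    label: int
    t: int     # arrival ("emerging") time step


def cell_of(x, bins):
    """Index of the grid cell (one hypothetical automaton per cell) owning x."""
    return tuple(min(int(v * bins), bins - 1) for v in x)


def relevance_weights(instances, now, bins=4, decay=0.05):
    """Recency-weighted label agreement within each cell's neighborhood."""
    grid = {}
    for inst in instances:
        grid.setdefault(cell_of(inst.x, bins), []).append(inst)

    weights = []
    for inst in instances:
        home = cell_of(inst.x, bins)
        agree = total = 0.0
        for cell, members in grid.items():
            # Moore neighborhood: cell indices differ by at most 1 per axis.
            if any(abs(a - b) > 1 for a, b in zip(cell, home)):
                continue
            for other in members:
                if other is inst:
                    continue
                # Assumed rule: newer neighbors get a larger vote.
                recency = math.exp(-decay * (now - other.t))
                total += recency
                if other.label == inst.label:
                    agree += recency
        # With no neighbors, nothing contradicts the instance, so keep it.
        weights.append(agree / total if total > 0 else 1.0)
    return weights


def select_relevant(instances, now, threshold=0.5, **kwargs):
    """Keep the instances whose relevance weight reaches the threshold."""
    weights = relevance_weights(instances, now, **kwargs)
    return [inst for inst, w in zip(instances, weights) if w >= threshold]


# Toy stream: the concept changes only in the region around (0.9, 0.9).
stream = [
    Instance((0.10, 0.10), 0, 0),    # old, region untouched by the change
    Instance((0.15, 0.10), 0, 1),
    Instance((0.10, 0.20), 0, 2),
    Instance((0.90, 0.90), 0, 0),    # old, contradicted by newer neighbors
    Instance((0.85, 0.90), 0, 1),
    Instance((0.90, 0.85), 1, 98),   # recent instances of the new concept
    Instance((0.95, 0.90), 1, 99),
    Instance((0.90, 0.95), 1, 100),
]
kept = select_relevant(stream, now=100)
print([(inst.label, inst.t) for inst in kept])
# → [(0, 0), (0, 1), (0, 2), (1, 98), (1, 99), (1, 100)]
```

In this toy stream a fixed-size window of recent instances would discard the three old instances near (0.1, 0.1), whereas the sketch keeps them because no newer neighbor contradicts them, and it drops only the two old instances in the region where the concept changed.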