Data stream dynamic clustering supported by Markov chain isomorphisms

  • Authors:
  • Marcelo Keese Albertini; Rodrigo Fernandes de Mello

  • Affiliations:
  • Faculty of Computing, Federal University of Uberlandia, Uberlandia, Brazil; Department of Computer Science, Institute of Mathematics and Computer Science, University of Sao Paulo, Sao Carlos, SP, Brazil

  • Venue:
  • Intelligent Data Analysis
  • Year:
  • 2013

Abstract

Several research fields have described phenomena that produce endless sequences of samples, referred to as data streams. These phenomena, whose set of dynamical properties (i.e., behavior) evolves over time, are studied using data clustering models obtained continuously throughout the endless data-gathering process. To cope with the characteristics of data streams, researchers have developed clustering techniques with low time complexity. However, the predefined, static parameters commonly found in current techniques, such as thresholds, the number of clusters, and learning rates, still limit the application of clustering to data streams. This inability to adapt the clustering process to behavior changes motivated this paper to propose an online, adaptive approach that detects changes and modifies parameters. The proposed approach uses the traditional k-means algorithm to update cluster prototypes and Markov chains as the statistical model of behavior. Behavior changes are detected by testing the isomorphism of Markov chains over time, on the grounds of Dynamical Systems Theory. The results have confirmed the advantages of the approach when compared with current techniques.
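
The three ingredients named in the abstract (prototype updates in the style of k-means, a Markov chain over cluster assignments, and an isomorphism test between chains) can be illustrated with a minimal sketch. It is not the authors' algorithm: the function names, the fixed `learning_rate`, the tolerance `tol`, and the brute-force permutation test are all assumptions made for illustration, and the abstract does not specify how the paper performs the isomorphism test.

```python
from itertools import permutations

import numpy as np


def online_kmeans_step(prototypes, x, learning_rate=0.1):
    """Move the nearest prototype toward sample x and return its index.

    Hypothetical helper: a fixed learning rate is used here, whereas the
    paper proposes adapting such parameters.
    """
    distances = np.linalg.norm(prototypes - x, axis=1)
    winner = int(np.argmin(distances))
    prototypes[winner] += learning_rate * (x - prototypes[winner])
    return winner


def transition_matrix(states, n_states):
    """Estimate a row-stochastic transition matrix from a state sequence."""
    counts = np.zeros((n_states, n_states))
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows with no observed transitions stay all-zero (no division by zero).
    return np.divide(counts, row_sums, out=np.zeros_like(counts),
                     where=row_sums > 0)


def chains_isomorphic(p, q, tol=0.1):
    """Assumed test: does some relabeling of states map p onto q (within tol)?

    Brute force over all permutations, so only practical for few states.
    """
    if p.shape != q.shape:
        return False
    for perm in permutations(range(p.shape[0])):
        idx = list(perm)
        if np.allclose(p[np.ix_(idx, idx)], q, atol=tol, rtol=0):
            return True
    return False
```

In a streaming setting, one possible use is to record the index returned by `online_kmeans_step` for each sample, build a transition matrix per time window, and treat a failed `chains_isomorphic` check between consecutive windows as a signal of behavior change.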