Density-based data streams clustering over sliding windows

  • Authors:
  • Jiadong Ren; Ruiqing Ma; Jiadong Ren

  • Affiliations:
  • College of Information Science and Engineering, Yanshan University, Qinhuangdao City, P.R. China; College of Information Science and Engineering, Yanshan University, Qinhuangdao City, P.R. China; School of Computer Science and Technology, Beijing Institute of Technology, Beijing City, P.R. China

  • Venue:
  • FSKD'09 Proceedings of the 6th international conference on Fuzzy systems and knowledge discovery - Volume 5
  • Year:
  • 2009

Abstract

Data stream clustering is an important task in data stream mining. In this paper, we propose SDStream, a new method for density-based clustering of data streams over sliding windows. SDStream adopts the CluStream clustering framework. In the online component, potential core-micro-cluster and outlier micro-cluster structures are introduced to maintain the potential clusters and outliers. They are stored in main memory in the form of Exponential Histograms of Cluster Feature (EHCF) and are kept up to date by maintaining the EHCFs. Outdated micro-clusters that need to be deleted are identified by the value of t in the Temporal Cluster Feature (TCF). In the offline component, the DBSCAN algorithm generates the final clusters of arbitrary shape from all the potential core-micro-clusters maintained online. Experimental results show that SDStream, which can generate clusters of arbitrary shape, achieves much higher clustering quality than CluStream, which generates spherical clusters.
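As a rough illustration of the kind of summary the abstract describes, the following Python sketch models a temporal cluster feature as an additive record and drops summaries that have fallen out of the sliding window. It is not the authors' implementation: the abstract does not define the TCF fields, so the layout (point count, linear sum, sum of squares, and t taken to be the timestamp of the most recent point), the names `TCF` and `drop_expired`, and the expiry rule `t > now - window` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Sequence


@dataclass
class TCF:
    """Hypothetical temporal cluster feature: an additive summary of points."""
    n: int            # number of points summarized
    ls: List[float]   # per-dimension linear sum
    ss: List[float]   # per-dimension sum of squares
    t: int            # timestamp of the most recent point (assumed meaning)

    @classmethod
    def from_point(cls, x: Sequence[float], t: int) -> "TCF":
        return cls(1, list(x), [v * v for v in x], t)

    def merge(self, other: "TCF") -> "TCF":
        # Cluster features are additive, so merging is a field-wise sum;
        # the merged summary keeps the later of the two timestamps.
        return TCF(
            self.n + other.n,
            [a + b for a, b in zip(self.ls, other.ls)],
            [a + b for a, b in zip(self.ss, other.ss)],
            max(self.t, other.t),
        )

    def center(self) -> List[float]:
        return [v / self.n for v in self.ls]

    def radius(self) -> float:
        # Root-mean-square deviation of the summarized points from the center.
        var = sum(s / self.n - (l / self.n) ** 2
                  for s, l in zip(self.ss, self.ls))
        return sqrt(max(var, 0.0))


def drop_expired(tcfs: List[TCF], now: int, window: int) -> List[TCF]:
    """Keep only summaries whose latest point lies in the last `window` time units."""
    return [c for c in tcfs if c.t > now - window]
```

For example, merging the summaries of the points (0, 0) at time 1 and (2, 0) at time 5 gives a summary centered at (1, 0) with radius 1 and t = 5; with `now=6` and `window=3`, only the summary stamped at time 5 survives `drop_expired`. The centers of the surviving summaries are the kind of input an offline density-based step such as DBSCAN would then group into final clusters.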