HClustream: A Novel Approach for Clustering Evolving Heterogeneous Data Stream

  • Authors: Chunyu Yang; Jie Zhou

  • Affiliations: Tsinghua University, Beijing, China (both authors)

  • Venue: ICDMW '06 Proceedings of the Sixth IEEE International Conference on Data Mining - Workshops
  • Year: 2006

Abstract

Continuously arriving, evolving data streams have become common in many fields, such as sensor networks, web click streams, and Internet traffic flows. Clustering is one of the most important stream mining tasks and has attracted extensive research from both the machine learning and data mining communities. Many stream clustering methods have been proposed and have proven efficient on specific problems. However, most of them handle only continuous attributes, and few address the clustering of heterogeneous data. In this paper, we propose a novel approach, based on the CluStream framework, for clustering data streams with heterogeneous features. Each micro-cluster is represented by the centroid of its continuous attributes and the histogram of its discrete attributes, and the k-prototype clustering algorithm is used to create both the micro-clusters and the macro-clusters. Experimental results on both synthetic and real data sets demonstrate the efficiency of the approach.
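The micro-cluster representation described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `MicroCluster`, the method names, the weight `gamma`, and the exact form of the distance are assumptions made for illustration. The sketch keeps a running sum (from which the centroid is derived) for the continuous attributes and one value histogram per discrete attribute, and scores a new point with a k-prototype-style mixed distance: squared Euclidean distance on the continuous part plus a weighted mismatch term on the discrete part. Here the mismatch term is taken, as an assumption, to be the fraction of cluster members whose value differs from the point's value.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class MicroCluster:
    """Hypothetical summary of a micro-cluster over mixed-type records."""
    n: int = 0
    linear_sum: list = field(default_factory=list)  # one running sum per continuous attribute
    histograms: list = field(default_factory=list)  # one Counter per discrete attribute

    def add(self, numeric, categorical):
        """Absorb one record into the summary."""
        if self.n == 0:
            self.linear_sum = [0.0] * len(numeric)
            self.histograms = [Counter() for _ in categorical]
        for i, x in enumerate(numeric):
            self.linear_sum[i] += x
        for hist, value in zip(self.histograms, categorical):
            hist[value] += 1
        self.n += 1

    def centroid(self):
        """Mean of the continuous attributes."""
        return [s / self.n for s in self.linear_sum]

    def distance(self, numeric, categorical, gamma=1.0):
        """Mixed distance; gamma (assumed) weights the discrete part."""
        d_num = sum((x - c) ** 2 for x, c in zip(numeric, self.centroid()))
        # Assumed mismatch term: share of members with a different value.
        d_cat = sum(1.0 - hist[value] / self.n
                    for hist, value in zip(self.histograms, categorical))
        return d_num + gamma * d_cat


def nearest(clusters, numeric, categorical, gamma=1.0):
    """Return the index of the closest non-empty micro-cluster."""
    candidates = [i for i, c in enumerate(clusters) if c.n > 0]
    return min(candidates,
               key=lambda i: clusters[i].distance(numeric, categorical, gamma))


mc = MicroCluster()
mc.add([1.0, 2.0], ["tcp"])
mc.add([3.0, 4.0], ["udp"])
print(mc.centroid())
print(mc.distance([2.0, 3.0], ["tcp"]))
```

Because both the running sums and the histograms are additive, two such summaries could be merged by adding their fields, which is the property a CluStream-style framework relies on when combining micro-clusters.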