SIC-means: a semi-fuzzy approach for clustering data streams using c-means

  • Authors: Amr Magdy; Mahmoud K. Bassiouny

  • Affiliations: Computer and Systems Engineering, Alexandria University, Egypt (both authors)

  • Venue: ANNPR'10 Proceedings of the 4th IAPR TC3 conference on Artificial Neural Networks in Pattern Recognition
  • Year: 2010

Abstract

In recent years, data streaming has gained significant importance. Advances in both hardware devices and software technologies enable many applications to generate continuous flows of data, which increases the need for algorithms that can process data streams efficiently. Additionally, the real-time requirements and evolving nature of data streams make stream mining problems, including clustering, challenging research problems. Fuzzy solutions have been proposed in the literature for clustering data streams. In this work, we propose a Soft Incremental C-Means (SIC-means) variant to enhance the performance of the fuzzy approach. The experimental evaluation shows that our approach achieves a better Xie-Beni index than the pure fuzzy approach across variations in the factors that affect the clustering results. In addition, we have conducted a study to analyze the sensitivity of the clustering results to the allowed fuzziness level and to the size of the data history used. This study shows that different datasets respond differently to changes in these factors, and that a dataset's behavior is correlated with the separation between its clusters.
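
For readers unfamiliar with the evaluation measure named in the abstract, the following is a minimal sketch of the Xie-Beni index computed over standard fuzzy c-means memberships. It is not the SIC-means algorithm itself, which the abstract does not specify. The function names (`sq_dist`, `fcm_memberships`, `xie_beni`), the fuzzifier default `m = 2`, and the toy data are assumptions made purely for illustration. Lower index values indicate more compact, better separated clusters.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fcm_memberships(points, centers, m=2.0):
    """Standard fuzzy c-means membership update (illustrative, not SIC-means).

    Returns one row per point; each row holds that point's membership
    in every cluster, and each row sums to 1.
    """
    exponent = 1.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [sq_dist(p, c) for c in centers]
        hits = [d == 0.0 for d in dists]
        if any(hits):
            # The point coincides with a center: split membership evenly
            # among the coinciding centers to avoid division by zero.
            row = [1.0 / sum(hits) if h else 0.0 for h in hits]
        else:
            inv = [(1.0 / d) ** exponent for d in dists]
            total = sum(inv)
            row = [v / total for v in inv]
        memberships.append(row)
    return memberships


def xie_beni(points, centers, memberships, m=2.0):
    """Xie-Beni index: fuzzy compactness divided by (n * minimum separation)."""
    compactness = sum(
        (u ** m) * sq_dist(p, c)
        for p, row in zip(points, memberships)
        for u, c in zip(row, centers)
    )
    separation = min(
        sq_dist(centers[i], centers[j])
        for i in range(len(centers))
        for j in range(i + 1, len(centers))
    )
    return compactness / (len(points) * separation)


# Hypothetical toy data: two well-separated clusters vs. two close ones.
far_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
far_centers = [(0.0, 0.5), (10.0, 0.5)]
near_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
near_centers = [(0.0, 0.5), (1.0, 0.5)]

xb_far = xie_beni(far_points, far_centers,
                  fcm_memberships(far_points, far_centers))
xb_near = xie_beni(near_points, near_centers,
                   fcm_memberships(near_points, near_centers))
print(xb_far < xb_near)  # the well-separated dataset scores lower (better)
```

In this sketch the denominator depends on the smallest distance between cluster centers, which is one way to see why an index-based comparison can be sensitive to how well separated a dataset's clusters are.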