Collaborative clustering with the use of Fuzzy C-Means and its quantification

  • Authors:
  • Witold Pedrycz; Partab Rai

  • Affiliations:
  • System Research Institute, Polish Academy of Sciences, Warsaw, Poland; Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada AB T6R 2G7

  • Venue:
  • Fuzzy Sets and Systems
  • Year:
  • 2008

Quantified Score

Hi-index 0.20

Abstract

In this study, we introduce the concept of collaborative fuzzy clustering: a conceptual and algorithmic machinery for the collective discovery of a common structure (relationships) within a finite family of data residing at individual data sites. The proposed optimization environment has two fundamental features. First, because existing constraints prevent individual sites from exchanging detailed numeric data, any communication has to be realized at the level of information granules. The specificity of these granules impacts the effectiveness of the ensuing collaborative activities. Second, the fuzzy clustering realized at the level of the individual data site has to constructively consider the findings communicated by other sites and act upon them while running the optimization confined to that particular data site. Adhering to these two general guidelines, we develop a comprehensive optimization scheme and discuss its two-phase character, in which the communication phase of the granular findings intertwines with the local optimization that is realized at the level of the individual site and exploits the evidence collected from other sites. The proposed augmented form of the objective function is essential in navigating the overall optimization, which has to be completed on the basis of the data and the available information granules. The intensity of collaboration is optimized by choosing a suitable tradeoff between the two components of the objective function. The objective-function-based clustering used here is the well-known Fuzzy C-Means (FCM) algorithm. Experimental studies include synthetic data, selected data sets from the machine learning repository, and weather data from Environment Canada.
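
The abstract does not state the augmented objective function explicitly, so the following Python sketch rests on assumptions rather than on the authors' formulation. It assumes that the information granules exchanged between sites are cluster prototypes, that the fuzzification coefficient is m = 2, and that each site minimizes Q = Σ u_ik² d_ik² + β Σ_j Σ (u_ik − ũ_ik[j])² d_ik², where ũ[j] is the partition matrix induced on the local data by the prototypes received from site j and β (here `beta`) sets the intensity of collaboration. It also assumes that cluster i at one site corresponds to cluster i at every other site. All function and parameter names are hypothetical. Under these assumptions the constrained minimization yields memberships that are a weighted average of the plain FCM memberships and the induced ones.

```python
def sq_dist(x, v):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(x, v))


def fcm_memberships(data, prototypes, eps=1e-12):
    """Standard FCM memberships for m = 2 (u_ik proportional to 1/d_ik^2).

    Returns U with U[k][i] = membership of point k in cluster i; rows sum to 1.
    """
    U = []
    for x in data:
        inv = [1.0 / (sq_dist(x, v) + eps) for v in prototypes]
        total = sum(inv)
        U.append([w / total for w in inv])
    return U


def weighted_prototypes(data, W):
    """Prototype i is the mean of the data weighted by column i of W."""
    dim = len(data[0])
    protos = []
    for i in range(len(W[0])):
        total = sum(row[i] for row in W)
        protos.append([sum(row[i] * x[d] for row, x in zip(W, data)) / total
                       for d in range(dim)])
    return protos


def local_fcm(data, init_protos, iters=50):
    """Plain FCM (m = 2) run at a single site, before any collaboration."""
    protos = [list(p) for p in init_protos]
    for _ in range(iters):
        U = fcm_memberships(data, protos)
        protos = weighted_prototypes(data, [[u * u for u in row] for row in U])
    return protos


def collaborative_fcm(data, protos, received, beta, iters=50):
    """Local optimization phase under the ASSUMED augmented objective.

    received: list of prototype sets communicated by other sites.
    beta:     collaboration intensity (beta = 0 reduces to plain FCM).
    """
    # Communication phase result: partition matrices induced on the local
    # data by the received prototypes (fixed during local optimization).
    induced = [fcm_memberships(data, r) for r in received]
    scale = 1.0 + beta * len(induced)
    U = fcm_memberships(data, protos)
    for _ in range(iters):
        U_fcm = fcm_memberships(data, protos)
        # Membership update derived from the assumed objective.
        U = [[(u + beta * sum(ind[k][i] for ind in induced)) / scale
              for i, u in enumerate(row)] for k, row in enumerate(U_fcm)]
        # Prototype update: weights u^2 + beta * sum_j (u - u_induced_j)^2.
        W = [[u * u + beta * sum((u - ind[k][i]) ** 2 for ind in induced)
              for i, u in enumerate(row)] for k, row in enumerate(U)]
        protos = weighted_prototypes(data, W)
    return protos, U
```

A site would first call `local_fcm` on its own data, send the resulting prototypes to the other sites, and then call `collaborative_fcm` with the prototypes it received. Repeating this exchange corresponds to the two-phase scheme described in the abstract. Choosing `beta` is the tradeoff between the two components of the objective function: larger values pull the local partition toward the structure reported by the other sites.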