Identification of optimal cluster centroid of multi-variable functions for clustering concept-drift categorical data

  • Authors:
  • K. Reddy Madhavi;A. Vinaya Babu;A. Anand Rao;S. V. N. Raju

  • Affiliations:
  • JNTUA, Ananthpur;JNTUHCE, Hyderabad;JNTUACE, Ananthpur;JNTUH, Hyderabad

  • Venue:
  • Proceedings of the International Conference on Advances in Computing, Communications and Informatics
  • Year:
  • 2012

Abstract

Identifying useful clusters in large datasets has attracted considerable interest. Because data on the World Wide Web is growing exponentially, which affects clustering accuracy and decision making, the concept underlying the clusters changes over time; this change is called concept drift. Newly arriving time-based data must be assigned (labeled) to the clusters already generated. For this data labeling to perform well, the clusters themselves must be effective. Selecting the initial cluster center (centroid) is the key factor that strongly influences the quality of the resulting clusters. Existing clustering methods select the centroid randomly, and different centroids result in different clusters. To avoid this random selection, we propose methods for selecting the centroid by analyzing the properties of the data, since data with different properties exist in the real world. Our previous work concentrated on identifying the centroid for single-variable and two-variable functions. This paper proposes methods for finding the optimal cluster centroid for multi-variable functions, after which any existing clustering algorithm can be applied to generate clusters using a suitable distance measure.
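
The abstract does not spell out the proposed centroid-selection procedure, so the following is only a minimal sketch of the general idea it describes: choose initial centroids from properties of the data instead of at random, then hand them to an existing clustering step with a suitable distance measure. Everything here is an assumption made for illustration and is not the authors' method: the seeding rule (per-attribute mode first, then the record farthest from the seeds already chosen), the Hamming distance, the function names, and the toy records.

```python
from collections import Counter


def hamming(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(records):
    """Most frequent value of each attribute (ties go to the first value seen)."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*records))


def deterministic_centroids(records, k):
    """Illustrative, non-random seeding (an assumption, not the paper's rule):
    start from the per-attribute mode, then repeatedly add the record whose
    distance to its nearest chosen seed is largest."""
    seeds = [column_modes(records)]
    while len(seeds) < k:
        farthest = max(records, key=lambda r: min(hamming(r, s) for s in seeds))
        seeds.append(tuple(farthest))
    return seeds


def assign(records, centroids):
    """Label each record with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: hamming(r, centroids[i]))
        for r in records
    ]


# Hypothetical toy data: three categorical attributes per record.
records = [
    ("red", "small", "round"),
    ("red", "small", "oval"),
    ("red", "medium", "round"),
    ("blue", "large", "square"),
    ("blue", "large", "oval"),
]

seeds = deterministic_centroids(records, k=2)
labels = assign(records, seeds)
print(seeds)
print(labels)
```

Because the seeds depend only on the data, repeated runs give the same starting centroids and therefore the same initial labeling, which is the property the abstract contrasts with random selection. In a fuller pipeline these seeds would be passed as the starting point to whichever categorical clustering algorithm is in use.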