A New Distributed Algorithm for Large Data Clustering

  • Authors: D. K. Bhattacharyya; A. Das

  • Venue: IDEAL '00: Proceedings of the Second International Conference on Intelligent Data Engineering and Automated Learning, Data Mining, Financial Engineering, and Intelligent Agents
  • Year: 2000

Abstract

This paper presents a new distributed data clustering algorithm that operates on very large data sets. The algorithm builds on a classical clustering algorithm, PAM [8, 9], and a spanning tree-based clustering algorithm, Clusterize [3]. It outperforms its counterparts in both clustering quality and execution time, makes better use of the computing resources involved in the clustering process, and runs in linear time.
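
The abstract does not spell out the distributed algorithm itself, so the following is only a minimal sketch of classical PAM (k-medoids), one of the two methods it builds on. It is not the paper's algorithm and is not linear-time. The function names, the Manhattan distance, and the simplified initialization (the first k points serve as starting medoids) are assumptions made for illustration.

```python
from itertools import product


def manhattan(a, b):
    # Assumed distance; PAM works with any dissimilarity measure.
    return sum(abs(x - y) for x, y in zip(a, b))


def total_cost(points, medoids, dist):
    # Sum of each point's distance to its nearest medoid.
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)


def pam(points, k, dist=manhattan):
    # Simplified start (assumption): the first k points are the medoids.
    medoids = list(range(k))
    best = total_cost(points, medoids, dist)
    improved = True
    while improved:
        improved = False
        # SWAP phase: try replacing each medoid with each non-medoid
        # and keep any swap that lowers the total cost.
        for i, cand in product(range(k), range(len(points))):
            if cand in medoids:
                continue
            trial = medoids[:i] + [cand] + medoids[i + 1:]
            cost = total_cost(points, trial, dist)
            if cost < best:
                medoids, best, improved = trial, cost, True
    # Assign every point to the position of its nearest medoid.
    labels = [
        min(range(k), key=lambda j: dist(p, points[medoids[j]]))
        for p in points
    ]
    return medoids, labels, best


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(pam(data, k=2))
```

Each pass of the swap phase evaluates every medoid and non-medoid pair against the whole data set, which is the cost that motivates faster and distributed variants for large inputs.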