PHA: A fast potential-based hierarchical agglomerative clustering method

  • Authors:
  • Yonggang Lu;Yi Wan

  • Affiliations:
  • School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, China;School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, China

  • Venue:
  • Pattern Recognition
  • Year:
  • 2013

Quantified Score

Hi-index 0.01

Visualization

Abstract

A novel potential-based hierarchical agglomerative (PHA) clustering method is proposed. In this method, we first construct a hypothetical potential field of all the data points, and show that this potential field is closely related to nonparametric estimation of the global probability density function of the data points. Then we propose a new similarity metric incorporating both the potential field which represents global data distribution information and the distance matrix which represents local data distribution information. Finally we develop another equivalent similarity metric based on an edge weighted tree of all the data points, which leads to a fast agglomerative clustering algorithm with time complexity O(N^2). The proposed PHA method is evaluated by comparing with six other typical agglomerative clustering methods on four synthetic data sets and two real data sets. Experiments show that it runs much faster than the other methods and produces the most satisfying results in most cases.