A Fast Algorithm for Density-Based Clustering in Large Database

  • Authors:
  • Bo Zhou;David Wai-Lok Cheung;Ben Kao


  • Venue:
  • PAKDD '99 Proceedings of the Third Pacific-Asia Conference on Methodologies for Knowledge Discovery and Data Mining
  • Year:
  • 1999

Abstract

Clustering in large databases is an important and useful data mining activity that aims to find meaningful patterns in a data set. Clustering in this setting raises several requirements: good efficiency on large databases, input parameters that are easy to determine, and the ability to separate noise from the clusters [1]. Conventional clustering algorithms can seldom fulfill all of these requirements. The notion of density-based clustering has been proposed to satisfy all of them [1]. In this paper, we present a new and more efficient density-based clustering algorithm called FDC. The clustering in this algorithm is defined by an equivalence relation on the objects in the database. The complexity of FDC is linear in the size of the database, which makes it much faster than the DBSCAN algorithm proposed in [1]. Extensive performance studies on both synthetic and real data show that FDC is the fastest density-based clustering algorithm proposed so far.
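The abstract does not describe how FDC works, so the following is only a minimal sketch of the general idea of treating density-based clusters as equivalence classes; it is not the FDC algorithm. It assumes DBSCAN-style parameters (a radius `eps` and a density threshold `min_pts`, both hypothetical names here), treats an object as "dense" when at least `min_pts` objects (itself included) lie within `eps`, and groups dense objects with a union-find structure. The neighborhood search is brute force and therefore quadratic, unlike the linear complexity claimed for FDC.

```python
from math import dist


def density_clusters(points, eps, min_pts):
    """Label points by density-connected component; -1 marks noise.

    Illustrative only: eps and min_pts are assumed DBSCAN-style
    parameters, and the neighbor search below is O(n^2).
    """
    n = len(points)
    # Brute-force eps-neighborhoods; each point is its own neighbor.
    neighbors = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    # A point is "dense" (core) if its neighborhood is large enough.
    core = [len(nb) >= min_pts for nb in neighbors]

    # Union-find: merging dense points within eps of each other yields
    # the equivalence classes of the transitive closure of that relation.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for i in range(n):
        if core[i]:
            for j in neighbors[i]:
                if core[j]:
                    union(i, j)

    # Number the equivalence classes of dense points 0, 1, 2, ...
    labels = [-1] * n
    class_ids = {}
    for i in range(n):
        if core[i]:
            labels[i] = class_ids.setdefault(find(i), len(class_ids))

    # Attach each non-dense point to the cluster of the first dense
    # neighbor found; points with no dense neighbor stay noise (-1).
    for i in range(n):
        if not core[i]:
            for j in neighbors[i]:
                if core[j]:
                    labels[i] = labels[j]
                    break
    return labels
```

Because "belongs to the same class" is reflexive, symmetric, and transitive on the dense objects, the result does not depend on the order in which objects are visited; only the assignment of border objects that touch two clusters is order-dependent in this sketch.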