A Fast Algorithm for Density-Based Clustering in Large Database

Authors:
Bo Zhou;David Wai-Lok Cheung;Ben Kao
Venue:
PAKDD '99 Proceedings of the Third Pacific-Asia Conference on Methodologies for Knowledge Discovery and Data Mining
Year:
1999

Citing 5

The SEQUOIA 2000 storage benchmark
SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data

BIRCH: an efficient data clustering method for very large databases
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data

CURE: an efficient clustering algorithm for large databases
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data

Incremental Clustering for Mining in a Data Warehousing Environment
VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases

Efficient and Effective Clustering Methods for Spatial Data Mining
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases

Cited 4

Density-Based Mining of Quantitative Association Rules
PAKDD '00 Proceedings of the 4th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Current Issues and New Applications

Geometric Algorithms for Density-Based Data Clustering
ESA '02 Proceedings of the 10th Annual European Symposium on Algorithms

A grid clustering algorithm based on reference and density
ASIAN'05 Proceedings of the 10th Asian Computing Science conference on Advances in computer science: data management on the web

A clustering approach using weighted similarity majority margins
ADMA'11 Proceedings of the 7th international conference on Advanced Data Mining and Applications - Volume Part I

Abstract

Clustering in large databases is an important and useful data mining activity that aims to find meaningful patterns in a data set. Several new requirements for clustering have been raised: good efficiency on large databases; input parameters that are easy to determine; and the ability to separate noise from the clusters [1]. However, conventional clustering algorithms can seldom fulfill all of these requirements. The notion of density-based clustering, which satisfies all of them, has been proposed [1]. In this paper, we present a new and more efficient density-based clustering algorithm called FDC. The clustering in this algorithm is defined by an equivalence relationship on the objects in the database. The complexity of FDC is linear in the size of the database, which makes it much faster than the DBSCAN algorithm proposed in [1]. Extensive performance studies have been carried out on both synthetic and real data, and they show that FDC is the fastest density-based clustering algorithm proposed so far.
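
To make the idea of a clustering "defined by an equivalence relationship" concrete, the following is a minimal illustrative sketch, not the FDC algorithm itself, whose details are not given in this abstract. It assumes a DBSCAN-style notion of density (a radius `eps` and a threshold `min_pts`, both names chosen here for illustration), treats two dense points within `eps` of each other as related, and uses union-find to take the transitive closure so that each cluster is one equivalence class. The neighbor search is brute force and therefore quadratic, so it does not reflect the linear complexity claimed for FDC; non-dense points are simply reported as noise.

```python
from math import dist


def density_clusters(points, eps, min_pts):
    """Return one label per point: a cluster id, or -1 for noise.

    Hypothetical helper for illustration only; eps and min_pts are
    assumed parameters, not taken from the FDC paper.
    """
    n = len(points)
    # Brute-force eps-neighborhoods (each point counts as its own neighbor).
    neighbors = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    dense = [len(nb) >= min_pts for nb in neighbors]

    parent = list(range(n))

    def find(i):
        # Follow parent links to the class representative (path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the classes of any two dense points that lie within eps.
    for i in range(n):
        if dense[i]:
            for j in neighbors[i]:
                if dense[j]:
                    parent[find(i)] = find(j)

    # Number the equivalence classes in order of first appearance.
    labels = [-1] * n
    ids = {}
    for i in range(n):
        if dense[i]:
            labels[i] = ids.setdefault(find(i), len(ids))
    return labels


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
    print(density_clusters(pts, eps=1.5, min_pts=3))
```

In this toy data set the first three points and the next three points each form one dense group, while the isolated last point has too few neighbors and is labeled as noise.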