Short communication: Algorithm to determine ε-distance parameter in density based clustering

Authors:
Sunita Jahirabadkar;Parag Kulkarni
Venue:
Expert Systems with Applications: An International Journal
Year:
2014

Abstract

The well-known clustering algorithm DBSCAN is founded on the density notion of clustering. However, its use of a single global density parameter, the ε-distance, makes DBSCAN unsuitable for datasets of varying density. Moreover, choosing a value for this parameter is not straightforward. In this paper, we generalise the algorithm in two ways. First, we adaptively determine the key input parameter ε-distance, which makes DBSCAN independent of domain knowledge and so satisfies the unsupervised notion of clustering. Second, deriving the ε-distance from the data distribution of each dimension makes the approach suitable for subspace clustering, which detects clusters enclosed in various subspaces of high-dimensional data. Experimental results illustrate that our approach can efficiently find clusters of varying sizes, shapes, and densities.
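
To make the difficulty of choosing the ε-distance concrete, the sketch below shows one common way of estimating it from the data: the sorted k-distance heuristic usually associated with DBSCAN. This is not the algorithm proposed in the paper, whose details the abstract does not give. The function names `k_distances` and `estimate_eps`, the default `k=4`, and the knee rule (the point farthest from the chord of the curve) are assumptions made for illustration only.

```python
import math


def k_distances(points, k):
    """Distance from each point to its k-th nearest neighbour, sorted ascending."""
    result = []
    for i, p in enumerate(points):
        # Distances from p to every other point (brute force, O(n^2) overall).
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return sorted(result)


def estimate_eps(points, k=4):
    """Illustrative eps estimate: the knee of the sorted k-distance curve.

    The knee is taken as the curve point farthest from the straight line
    (chord) joining the first and last points of the curve. This is a
    simple heuristic, not the method described in the paper.
    """
    kd = k_distances(points, k)
    n = len(kd)
    if n < 3 or kd[-1] == kd[0]:
        # Too few points, or a flat curve: no knee to find.
        return kd[-1]
    dx, dy = n - 1, kd[-1] - kd[0]
    # Unnormalised perpendicular distance of (i, kd[i]) from the chord.
    knee = max(range(n), key=lambda i: abs(dy * i - dx * (kd[i] - kd[0])))
    return kd[knee]
```

Because this yields a single global value, it still shares the limitation the abstract points out for datasets of varying density; it only removes the need to guess the value by hand.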