Efficient Density-Based Clustering of Complex Objects

  • Authors:
  • Stefan Brecheisen;Hans-Peter Kriegel;Martin Pfeifle

  • Affiliations:
  • University of Munich, Germany;University of Munich, Germany;University of Munich, Germany

  • Venue:
  • ICDM '04 Proceedings of the Fourth IEEE International Conference on Data Mining
  • Year:
  • 2004

Abstract

Data mining in large databases of complex objects from scientific, engineering, or multimedia applications is becoming increasingly important. In many application domains, complex object representations are used together with complex distance functions to measure the similarity between objects. Often, simpler distance functions that can be computed much more efficiently are available alongside these complex distance measures. Traditionally, the well-known concept of multi-step query processing, which is based on exact and lower-bounding approximate distance functions, has been used independently of data mining algorithms. In this paper, we demonstrate how the paradigm of multi-step query processing can be integrated into the two density-based clustering algorithms DBSCAN and OPTICS, resulting in a considerable efficiency boost. Our approach tries to confine itself to ε-range queries on the simple distance functions and carries out complex distance computations only at those stages of the clustering algorithm where they are required to compute the correct clustering result. In a broad experimental evaluation based on real-world test data sets, we demonstrate that our approach accelerates the generation of flat and hierarchical density-based clusterings by more than one order of magnitude.
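
The filter-and-refine idea behind multi-step query processing can be illustrated with a minimal sketch. It is not the paper's integration into DBSCAN or OPTICS, only a generic ε-range query that uses a cheap lower-bounding distance to prune objects before any exact distance is computed. All names here (`filter_distance`, `exact_distance`, `multi_step_range_query`) are hypothetical, and the Chebyshev and Euclidean distances are stand-ins chosen only because the former never exceeds the latter.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]
Distance = Callable[[Point, Point], float]


def exact_distance(a: Point, b: Point) -> float:
    """Stand-in for an expensive exact distance (here: Euclidean)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def filter_distance(a: Point, b: Point) -> float:
    """Stand-in for a cheap filter distance (here: Chebyshev).

    It lower-bounds the Euclidean distance, so pruning with it
    can never discard a true result.
    """
    return max(abs(x - y) for x, y in zip(a, b))


def multi_step_range_query(
    database: Sequence[Point],
    query: Point,
    eps: float,
    d_filter: Distance = filter_distance,
    d_exact: Distance = exact_distance,
) -> Tuple[List[Point], int]:
    """Return all objects within eps of query and the number of exact calls."""
    result: List[Point] = []
    exact_calls = 0
    for obj in database:
        # Filter step: if even the lower bound exceeds eps, skip the object.
        if d_filter(query, obj) > eps:
            continue
        # Refinement step: only surviving candidates pay for the exact distance.
        exact_calls += 1
        if d_exact(query, obj) <= eps:
            result.append(obj)
    return result, exact_calls


if __name__ == "__main__":
    db = [(0.0, 0.0), (1.0, 0.0), (0.8, 0.8), (3.0, 3.0)]
    neighbors, calls = multi_step_range_query(db, (0.0, 0.0), eps=1.0)
    print(neighbors, calls)
```

In this toy data set the filter step discards one of the four objects, so only three exact distances are computed. The approach described in the abstract goes further by postponing exact computations inside the clustering algorithm itself until they are needed for a correct result.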