Density-based clustering of uncertain data

  • Authors: Hans-Peter Kriegel; Martin Pfeifle
  • Affiliations: University of Munich, Germany; University of Munich, Germany
  • Venue: Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
  • Year: 2005

Abstract

In many different application areas, e.g., sensor databases, location-based services, or face recognition systems, distances between objects have to be computed based on vague and uncertain data. Commonly, the distance between two such uncertain object descriptions is expressed by a single numerical value. Based on such single-valued distance functions, standard data mining algorithms can work without any changes. In this paper, we propose to express the similarity between two fuzzy objects by distance probability functions. These fuzzy distance functions assign a probability value to each possible distance value. By integrating these fuzzy distance functions directly into data mining algorithms, the full information provided by these functions is exploited. In order to demonstrate the benefits of this general approach, we enhance the density-based clustering algorithm DBSCAN so that it can work directly on these fuzzy distance functions. In a detailed experimental evaluation based on artificial and real-world data sets, we show the characteristics and benefits of our new approach.
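
The idea of a distance probability function can be illustrated with a small sketch. It is not the paper's implementation. It assumes each uncertain object is given as a list of equally likely sample points, estimates the probability that two objects lie within a radius eps of each other, and then combines such probabilities under an independence assumption to get the chance that an object has at least k neighbors. The function names, the sample-based representation, and the independence assumption are all hypothetical choices made for this illustration.

```python
import math
from itertools import product


def distance_probability(samples_a, samples_b, eps):
    """Estimate P(d(a, b) <= eps) for two uncertain objects.

    Assumption: each object is a non-empty list of equally likely
    sample points (tuples of coordinates).
    """
    pairs = list(product(samples_a, samples_b))
    within = sum(1 for p, q in pairs if math.dist(p, q) <= eps)
    return within / len(pairs)


def prob_at_least(probs, k):
    """Probability that at least k of the given events occur.

    Assumption: the events are independent, each with its own
    probability (a Poisson-binomial count, computed by dynamic
    programming).
    """
    dist = [1.0]  # dist[c] = probability that exactly c events occurred
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for c, mass in enumerate(dist):
            nxt[c] += mass * (1.0 - p)
            nxt[c + 1] += mass * p
        dist = nxt
    return sum(dist[k:])


# Two uncertain objects, each described by two possible positions.
a = [(0.0, 0.0), (1.0, 0.0)]
b = [(0.0, 1.0), (3.0, 0.0)]
p_ab = distance_probability(a, b, eps=1.5)

# Chance that `a` has at least one neighbor among two objects that are
# each within eps with probability p_ab.
p_core = prob_at_least([p_ab, p_ab], k=1)
```

A single-valued distance would collapse each pair of objects to one number and answer the eps test with a hard yes or no. The sketch instead keeps a probability for each pair, which is the kind of information the abstract argues a clustering algorithm should exploit.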