Indexed-based density biased sampling for clustering applications

  • Authors: Alexandros Nanopoulos; Yannis Theodoridis; Yannis Manolopoulos

  • Affiliations: Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece; Department of Informatics, University of Piraeus, Greece; Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece

  • Venue: Data & Knowledge Engineering
  • Year: 2006

Abstract

Density biased sampling (DBS) has been proposed to address the limitations of uniform sampling by producing the desired probability distribution in the sample. The ease of producing a random sample depends on the available mechanism for accessing the elements of the dataset. Existing DBS algorithms perform sampling over flat files. In this paper, we develop a new method that exploits spatial indexes and the local density information they preserve to provide a good-quality sampling result and fast access to the elements of the dataset. With the proposed method, accurate density estimates can be produced with respect to factors such as skew, noise, or dimensionality. Moreover, a significant improvement in sampling time is attained. The performance of the proposed method is examined analytically and experimentally. The comparative results illustrate its superiority over existing methods.
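
To make the idea of density biased sampling concrete, the following is a minimal Python sketch of a grid-based variant working on an in-memory list of points. It is not the index-based method of the paper: the grid, the function names, the parameters `cell_size` and `sample_size`, and the exponent convention `a` (a point in a cell holding n points is kept with probability proportional to n^a, so `a = 0` reduces to uniform sampling, `a < 0` favours sparse regions, and `a > 0` favours dense ones) are all assumptions made for illustration.

```python
import random
from collections import Counter


def inclusion_probabilities(points, cell_size, sample_size, a):
    """Per-point inclusion probabilities for a grid-based density biased sample.

    Hypothetical convention: a point whose grid cell holds n points gets a
    weight of n ** a; weights are scaled so the probabilities sum to
    sample_size, then capped at 1.0 (capping can lower the expected size).
    """
    # Map every point to the integer coordinates of its grid cell.
    cells = [tuple(int(x // cell_size) for x in p) for p in points]
    counts = Counter(cells)
    # Normalising constant: sum of n ** a over all points (n points per cell).
    norm = sum(n * (n ** a) for n in counts.values())
    return [min(1.0, sample_size * counts[c] ** a / norm) for c in cells]


def density_biased_sample(points, cell_size, sample_size, a, seed=0):
    """Keep each point independently with its inclusion probability."""
    rng = random.Random(seed)
    probs = inclusion_probabilities(points, cell_size, sample_size, a)
    return [p for p, prob in zip(points, probs) if rng.random() < prob]


# Toy data: eight points packed into one cell, two points in a distant cell.
dense = [(0.1 * i, 0.05 * i) for i in range(8)]
sparse = [(5.2, 5.1), (5.7, 5.9)]
points = dense + sparse

uniform_probs = inclusion_probabilities(points, 1.0, 2, a=0)
sparse_biased_probs = inclusion_probabilities(points, 1.0, 2, a=-1)
sample = density_biased_sample(points, 1.0, 2, a=-1)
```

In this toy setting, `a = 0` gives every point the same probability, while `a = -1` gives each of the two isolated points a higher chance of being kept than any point in the crowded cell. Note that the sketch scans every point, as a flat-file approach would; the abstract's method instead relies on the density information held in a spatial index.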