Isolating top-k dense regions with filtration of sparse background

  • Authors: Ramkishore Bhattacharyya

  • Affiliations: Microsoft India (R&D) Pvt. Ltd., Hyderabad 500 046, India

  • Venue: Pattern Recognition Letters
  • Year: 2011

Abstract

The classical notion of clustering is to induce an equivalence class partition on a set of points, where each class, being a homogeneous group, is called a cluster. Because the partition is into equivalence classes, every point must belong to exactly one cluster. However, in many applications the data are distributed such that only a subset of the points tends to flock into distinct clusters while the rest are scattered at random. The present paper introduces an algorithm to find an optimal subset of points with sufficient grouping tendency, ideally filtering out the random ones. It builds the neighborhood population around every point and picks the top k dense regions, with possible reshuffling of points in post-processing. The performance of the algorithm is evaluated on real and simulated data. Comparative analysis against some other state-of-the-art algorithms on different quality indices establishes the effectiveness of the approach.
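
The abstract gives only the outline of the method, so the following is a minimal illustrative sketch of the general idea and not the paper's algorithm. It assumes a fixed neighborhood radius, Euclidean distance, and a greedy choice of the k most populated neighborhoods; the function name `top_k_dense_regions` and its parameters are hypothetical, and the post-processing reshuffle is omitted because the abstract does not describe it.

```python
import numpy as np

def top_k_dense_regions(points, k, radius):
    """Greedy sketch: label the k most populated neighborhoods, rest is -1.

    Assumptions (not taken from the paper): fixed radius, Euclidean
    distance, no reshuffling of points after the regions are picked.
    """
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances, shape (n, n).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    neighbors = dist <= radius          # each point is its own neighbor
    labels = np.full(len(points), -1)   # -1 marks sparse background

    for region in range(k):
        free = labels == -1
        # Neighborhood population, counting only still-unassigned points.
        counts = (neighbors & free[None, :]).sum(axis=1)
        counts[~free] = 0               # assigned points cannot be centers
        center = int(counts.argmax())
        if counts[center] <= 1:         # only isolated points remain
            break
        labels[neighbors[center] & free] = region
    return labels

# Two tight groups plus three scattered background points.
data = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1),
        (5, 5), (5.1, 5), (5, 5.1),
        (10, 0), (-7, 3), (2, 9)]
labels = top_k_dense_regions(data, k=2, radius=0.5)
print(labels)
```

In this toy input the two groups receive labels 0 and 1, and the three scattered points keep the label -1, which mirrors the abstract's goal of retaining only points with sufficient grouping tendency. The pairwise distance matrix costs O(n²) memory, so a spatial index would be a more reasonable choice for large data sets.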