An efficient set estimator in high dimensions: consistency and applications to fast data visualization

  • Authors:
  • A. Ray Chaudhuri; A. Basu; K. Tan; S. Bhandari; B. B. Chaudhuri

  • Affiliations:
  • Computer Science and Engineering Department, Jadavpur University, Kolkata, West Bengal, India; Indian Statistical Institute, 203 B.T. Road, Kolkata, West Bengal, India; University of Illinois at Urbana-Champaign, Urbana, IL; Indian Statistical Institute, 203 B.T. Road, Kolkata, West Bengal, India; Indian Statistical Institute, 203 B.T. Road, Kolkata, West Bengal, India

  • Venue:
  • Computer Vision and Image Understanding
  • Year:
  • 2004

Abstract

Visualizing data from a point set by estimating the underlying region is a problem of considerable practical interest, and it is closely related to set estimation. The most important issue in set estimation is consistency: a set estimator is consistent if it converges, in an appropriate sense, to the original set as the sample size increases. Only a few existing point-pattern shape descriptors that estimate the underlying region are consistent set estimators. To be useful as a shape descriptor, a set estimator should also satisfy several other criteria, such as correct identification of the number of components, robustness in the presence of noise, and computational efficiency. Here we propose a class of set estimators with these properties, called s-shapes, which remain consistent in finite dimensions when the data are generated from any continuous distribution. These set estimators are easy to compute and can be used effectively for fast data visualization. We also report detailed studies of their performance, including error rates, robustness in the presence of noise, and run-time analysis.
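
To make the notion of consistency concrete, the following is a minimal sketch and not the s-shape construction from the paper, which the abstract does not define. It assumes a much simpler stand-in estimator (the union of square grid cells that contain at least one sample point), a unit disk as the true set, the area of the symmetric difference as the error measure, and a cell width of h = n^(-1/4). All function names and these choices are hypothetical and for illustration only.

```python
import math
import random


def sample_unit_disk(n, rng):
    """Draw n points uniformly from the unit disk by rejection sampling."""
    points = []
    while len(points) < n:
        x, y = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:
            points.append((x, y))
    return points


def grid_estimate(points, h):
    """Stand-in estimator: indices of side-h grid cells holding a sample point."""
    return {(math.floor(x / h), math.floor(y / h)) for x, y in points}


def in_unit_disk(x, y):
    """Membership test for the assumed true set."""
    return x * x + y * y <= 1.0


def symmetric_difference_area(cells, h, in_true_set, half_width=1.5, m=300):
    """Approximate the area where estimate and true set disagree.

    Uses an m x m lattice of test points over [-half_width, half_width]^2.
    """
    step = 2.0 * half_width / m
    mismatches = 0
    for i in range(m):
        x = -half_width + (i + 0.5) * step
        for j in range(m):
            y = -half_width + (j + 0.5) * step
            in_estimate = (math.floor(x / h), math.floor(y / h)) in cells
            if in_estimate != in_true_set(x, y):
                mismatches += 1
    return mismatches * step * step


def consistency_demo(sample_sizes=(200, 2000, 20000), seed=0):
    """Return the approximate error of the grid estimator for each sample size."""
    rng = random.Random(seed)
    errors = []
    for n in sample_sizes:
        h = n ** -0.25  # assumed cell width; shrinks as n grows
        cells = grid_estimate(sample_unit_disk(n, rng), h)
        errors.append(symmetric_difference_area(cells, h, in_unit_disk))
    return errors


errors = consistency_demo()
for n, err in zip((200, 2000, 20000), errors):
    print(f"n={n:6d}  symmetric-difference area ~ {err:.3f}")
```

Under these assumptions the printed error should shrink as the sample size grows, which is the behaviour a consistent estimator is expected to show. The cell width has to shrink slowly enough that cells inside the set stay occupied, which is why it is tied to n here.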