FRSDE: Fast reduced set density estimator using minimal enclosing ball approximation

  • Authors:
  • Zhaohong Deng;Fu-Lai Chung;Shitong Wang

  • Affiliations:
  • School of Information, Jiangnan University, WuXi 214122, China and Department of Computing, Hong Kong Polytechnic University, Hong Kong, China;Department of Computing, Hong Kong Polytechnic University, Hong Kong, China;School of Information, Jiangnan University, WuXi 214122, China and Department of Computing, Hong Kong Polytechnic University, Hong Kong, China

  • Venue:
  • Pattern Recognition
  • Year:
  • 2008

Abstract

The reduced set density estimator (RSDE) is an important technique that can replace the classical Parzen window (PW) estimator to save computational cost. Although RSDE compares favorably with several existing methods in density accuracy and computation time, it remains hard to apply in practice because training the model's weighting coefficients on large data sets has high time complexity (no less than O(N^2)) and space complexity (O(N^2)). To overcome this shortcoming, this study proposes a fast reduced set density estimator (FRSDE). First, the relationship between RSDE and the minimal enclosing ball (MEB) problem in computational geometry is revealed, and RSDE is shown to be equivalent to a special MEB problem. Based on this finding, a fast core-set-based MEB approximation algorithm is used to develop FRSDE. Compared with RSDE, FRSDE has a distinctive advantage: the upper bound of its time complexity is linear in the size N of a large data set, and the upper bound of its space complexity is independent of N. Experimental results show that FRSDE is competitive in density accuracy and, on large data sets, has an overwhelming advantage over RSDE in data condensation rate and in training time for the weighting coefficients.
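
To give a feel for why a core-set-based MEB approximation can run in time linear in N, here is a minimal sketch of a well-known simple (1 + eps)-approximation scheme for the minimal enclosing ball in plain Euclidean space. It is an illustration only and is not the FRSDE algorithm from the paper, which the abstract does not spell out and which presumably works on a kernel-induced formulation. The function name `approx_meb`, its parameters, and the choice of the first point as the starting center are all assumptions made for this example.

```python
import numpy as np

def approx_meb(points, eps=0.1):
    """Approximate the minimal enclosing ball of an (N, d) array of points.

    Illustrative Euclidean sketch (hypothetical helper, not the paper's method).
    Returns (center, radius, core_indices).
    """
    points = np.asarray(points, dtype=float)
    center = points[0].copy()          # arbitrary starting center (assumption)
    core_idx = {0}                     # indices of points that shaped the center
    n_iter = int(np.ceil(1.0 / eps ** 2))  # iteration count does not depend on N
    for i in range(1, n_iter + 1):
        # One O(N) pass: find the point farthest from the current center.
        dist2 = np.sum((points - center) ** 2, axis=1)
        far = int(np.argmax(dist2))
        core_idx.add(far)
        # Move the center a shrinking step toward that farthest point.
        center += (points[far] - center) / (i + 1)
    # Radius that is guaranteed to enclose every input point.
    radius = float(np.sqrt(np.max(np.sum((points - center) ** 2, axis=1))))
    return center, radius, sorted(core_idx)
```

Each iteration scans the data once, and the number of iterations depends only on eps, so the total work grows linearly with N. The set of farthest points collected along the way has at most ceil(1/eps^2) + 1 members regardless of N, which mirrors, in this simplified setting, the abstract's claim of a space bound independent of N.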