Probability density estimation from optimally condensed data samples

  • Authors:
  • M. Girolami; Chao He

  • Affiliations:
  • Sch. of Inf. & Commun. Technol., Paisley Univ., UK;-

  • Venue:
  • IEEE Transactions on Pattern Analysis and Machine Intelligence
  • Year:
  • 2003

Abstract

The requirement to reduce the computational cost of evaluating a point probability density estimate when employing a Parzen window estimator is a well-known problem. This paper presents the Reduced Set Density Estimator that provides a kernel-based density estimator which employs a small percentage of the available data sample and is optimal in the L₂ sense. While only requiring 𝒪(N²) optimization routines to estimate the required kernel weighting coefficients, the proposed method provides similar levels of performance accuracy and sparseness of representation as Support Vector Machine density estimation, which requires 𝒪(N³) optimization routines, and which has previously been shown to consistently outperform Gaussian Mixture Models. It is also demonstrated that the proposed density estimator consistently provides superior density estimates for similar levels of data reduction to that provided by the recently proposed Density-Based Multiscale Data Condensation algorithm and, in addition, has comparable computational scaling. The additional advantage of the proposed method is that no extra free parameters are introduced such as regularization, bin width, or condensation ratios, making this method a very simple and straightforward approach to providing a reduced set density estimator with comparable accuracy to that of the full sample Parzen density estimator.
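
The idea of a weighted kernel estimator whose weights are chosen in the L₂ sense can be illustrated with a small sketch. The abstract does not spell out the objective or the optimizer, so everything below is an assumption made for illustration: a 1-D Gaussian kernel with a fixed bandwidth `h`, an integrated-squared-error objective of the form ½·wᵀQw − wᵀd with the full Parzen estimate plugged in for the unknown density, and weights constrained to be nonnegative and sum to one. The solver is plain projected gradient descent, swapped in for simplicity; it is not claimed to be the optimization routine used in the paper. All function names are hypothetical.

```python
import math
import random


def gauss(u, h):
    """1-D Gaussian kernel of bandwidth h evaluated at offset u."""
    return math.exp(-0.5 * (u / h) ** 2) / (h * math.sqrt(2.0 * math.pi))


def build_qp(data, h):
    """Build Q and d for the assumed objective 0.5*w'Qw - w'd.

    Q[i][j] is the integral of the product of the kernels centred at x_i
    and x_j, which for Gaussians is a Gaussian of bandwidth sqrt(2)*h.
    d[i] is the full-sample Parzen estimate evaluated at x_i.
    """
    n = len(data)
    wide = math.sqrt(2.0) * h
    Q = [[gauss(xi - xj, wide) for xj in data] for xi in data]
    d = [sum(gauss(xi - xj, h) for xj in data) / n for xi in data]
    return Q, d


def objective(w, Q, d):
    """Value of 0.5*w'Qw - w'd."""
    n = len(w)
    quad = sum(w[i] * Q[i][j] * w[j] for i in range(n) for j in range(n))
    return 0.5 * quad - sum(wi * di for wi, di in zip(w, d))


def project_to_simplex(v):
    """Euclidean projection onto {w : w_i >= 0, sum(w) = 1}."""
    theta, running = 0.0, 0.0
    for k, uk in enumerate(sorted(v, reverse=True), start=1):
        running += uk
        t = (running - 1.0) / k
        if uk - t > 0:
            theta = t
    return [max(vi - theta, 0.0) for vi in v]


def reduced_set_weights(data, h, iters=500):
    """Projected gradient descent on the assumed objective (illustrative)."""
    n = len(data)
    Q, d = build_qp(data, h)
    # The largest row sum bounds the largest eigenvalue of Q (all entries
    # are positive), so this step size keeps the iteration a descent method.
    step = 1.0 / max(sum(row) for row in Q)
    w = [1.0 / n] * n  # start from the Parzen (uniform) weights
    for _ in range(iters):
        grad = [sum(Q[i][j] * w[j] for j in range(n)) - d[i] for i in range(n)]
        w = project_to_simplex([w[i] - step * grad[i] for i in range(n)])
    return w


def reduced_density(x, data, w, h):
    """Weighted kernel estimate; points with zero weight are skipped."""
    return sum(wi * gauss(x - xi, h) for wi, xi in zip(w, data) if wi > 0.0)


if __name__ == "__main__":
    random.seed(0)
    sample = [random.gauss(0.0, 1.0) for _ in range(40)]
    weights = reduced_set_weights(sample, h=0.5)
    kept = sum(1 for wi in weights if wi > 0.0)
    print(f"kept {kept} of {len(sample)} points")
    print(f"estimate at 0: {reduced_density(0.0, sample, weights, 0.5):.4f}")
```

Because each kernel integrates to one and the weights stay on the simplex, the weighted estimate remains a valid density. Any sample whose weight is driven to exactly zero by the projection can be dropped at evaluation time, which is where a reduction in evaluation cost relative to the full Parzen estimator would come from in this sketch.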