Fast Parzen Window density estimator

Authors:
Xiaoxia Wang;Peter Tiňo;Mark A. Fardal;Somak Raychaudhury;Arif Babul
Affiliations:
School of Computer Science, University of Birmingham, UK;School of Computer Science, University of Birmingham, UK;Dept. of Astronomy, University of Massachusetts;School of Physics and Astronomy, University of Birmingham, UK;Department of Physics and Astronomy, University of Victoria, Canada
Venue:
IJCNN'09: Proceedings of the 2009 International Joint Conference on Neural Networks
Year:
2009



Abstract

Parzen Windows (PW) is a popular non-parametric density estimation technique. In general, the smoothing kernel is placed on every available data point, which makes the algorithm computationally expensive for large datasets. Several approaches have been proposed to reduce the computational cost of PW, either by subsampling the dataset or by imposing sparsity on the density model. Typically the latter requires a rather involved learning process. In this paper, we propose a new simple and efficient kernel-based method for non-parametric probability density function (pdf) estimation on large datasets. We cover the entire data space with a set of fixed-radius hyper-balls whose densities are represented by full-covariance Gaussians. The accuracy and efficiency of the new estimator are verified on both a synthetic dataset and large datasets of astronomical simulations of the galaxy disruption process. Experiments demonstrate that the estimation accuracy of the new estimator is comparable to that of previous approaches, but with a significant speed-up. We also show that the pdf learnt by the new estimator could be used to automatically find the best-matching set in large-scale astronomical simulations.
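
To make the contrast in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. `parzen_density` is the standard Parzen window estimate with an isotropic Gaussian kernel, which touches every data point on each query. The remaining functions illustrate the general idea of covering the data with fixed-radius balls and summarising each ball by one full-covariance Gaussian. The abstract does not say how the balls are constructed, how components are weighted, or how covariances are regularised, so the greedy cover, the count-proportional weights, the `ridge` term, and all function names here are assumptions made for illustration only.

```python
import numpy as np


def parzen_density(x, data, h):
    """Standard Parzen window estimate at x, isotropic Gaussian kernel of width h."""
    n, d = data.shape
    sq_dist = np.sum((data - x) ** 2, axis=1)
    norm = (2.0 * np.pi * h ** 2) ** (d / 2.0)
    # One kernel evaluation per data point: O(n) work for every query.
    return float(np.mean(np.exp(-sq_dist / (2.0 * h ** 2))) / norm)


def greedy_ball_cover(data, radius):
    """Assign each point to exactly one fixed-radius ball (assumed greedy scheme).

    Returns a list of index arrays, one per ball.
    """
    unassigned = np.ones(len(data), dtype=bool)
    balls = []
    while unassigned.any():
        centre = data[np.argmax(unassigned)]  # first point not yet covered
        close = np.linalg.norm(data - centre, axis=1) <= radius
        inside = unassigned & close
        balls.append(np.flatnonzero(inside))
        unassigned &= ~inside
    return balls


def fit_ball_mixture(data, radius, ridge=1e-3):
    """Fit one full-covariance Gaussian per ball.

    Weights are proportional to point counts; `ridge` is an assumed
    regulariser that keeps sparsely populated balls invertible.
    """
    n, d = data.shape
    components = []
    for idx in greedy_ball_cover(data, radius):
        pts = data[idx]
        mean = pts.mean(axis=0)
        if len(idx) > 1:
            cov = np.cov(pts, rowvar=False).reshape(d, d)
        else:
            cov = np.zeros((d, d))
        components.append((len(idx) / n, mean, cov + ridge * np.eye(d)))
    return components


def mixture_density(x, components):
    """Evaluate the ball-based Gaussian mixture at x."""
    total = 0.0
    for weight, mean, cov in components:
        diff = x - mean
        maha = diff @ np.linalg.solve(cov, diff)
        norm = np.sqrt((2.0 * np.pi) ** len(mean) * np.linalg.det(cov))
        total += weight * np.exp(-0.5 * maha) / norm
    return float(total)
```

In this sketch a query against the mixture costs one Gaussian evaluation per ball rather than one per data point, which is the kind of saving the abstract refers to when the number of balls is much smaller than the dataset. The choice of `radius` trades accuracy against the number of components in the same way the kernel width `h` controls smoothing in the plain Parzen estimate.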