One often wants to estimate statistical models where the probability density function is known only up to a multiplicative normalization constant. Typically, one then has to resort to Markov Chain Monte Carlo methods, or to approximations of the normalization constant. Here, we propose that such models can be estimated by minimizing the expected squared distance between the gradient of the log-density given by the model and the gradient of the log-density of the observed data. While estimating the gradient of the log-density function is, in principle, a very difficult non-parametric problem, we prove a surprising result that gives a simple formula for this objective function. The density function of the observed data does not appear in this formula, which simplifies to a sample average of a sum of certain derivatives of the log-density given by the model. The validity of the method is demonstrated on multivariate Gaussian and independent component analysis models, and by estimating an overcomplete filter set for natural image data.
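To make the idea concrete, here is a minimal sketch (not one of the paper's experiments) for a one-dimensional zero-mean Gaussian whose density is specified only up to the normalization constant, log p(x; s) = -x²/(2s) + const with s the variance. The model score is ψ(x) = -x/s and its derivative is ψ′(x) = -1/s, so the objective reduces to a sample average of ψ′(x) + ψ(x)²/2, with no normalization constant required. The grid search and the chosen parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=2.0, size=100_000)  # data with true variance 4

def score_matching_objective(s, x):
    """Sample average of psi'(x) + psi(x)^2 / 2 for the non-normalized
    Gaussian model log p(x; s) = -x^2 / (2 s) + const."""
    psi = -x / s          # model score: gradient of log-density w.r.t. x
    psi_prime = -1.0 / s  # derivative of the score
    return np.mean(psi_prime + 0.5 * psi**2)

# Minimize the objective over a grid of candidate variances.
grid = np.linspace(0.5, 10.0, 2000)
values = [score_matching_objective(s, x) for s in grid]
s_hat = grid[int(np.argmin(values))]

print(s_hat)  # close to the sample second moment of x
```

For this toy model the objective can be minimized in closed form, and the minimizer is exactly the sample second moment — the grid search merely makes the "minimize a sample average of derivatives of the model log-density" recipe explicit.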