Forest Density Estimation

Authors:
Han Liu;Min Xu;Haijie Gu;Anupam Gupta;John Lafferty;Larry Wasserman
Venue:
The Journal of Machine Learning Research
Year:
2011

Abstract

We study graph estimation and density estimation in high dimensions, using a family of density estimators based on forest-structured undirected graphical models. For density estimation, we do not assume the true distribution corresponds to a forest; rather, we form kernel density estimates of the bivariate and univariate marginals, and apply Kruskal's algorithm to estimate the optimal forest on held-out data. We prove an oracle inequality on the excess risk of the resulting estimator relative to the risk of the best forest. For graph estimation, we consider the problem of estimating forests with restricted tree sizes. We prove that finding a maximum weight spanning forest with restricted tree size is NP-hard, and develop an approximation algorithm for this problem. Viewing the tree size as a complexity parameter, we then select a forest using data splitting, and prove bounds on excess risk and structure selection consistency of the procedure. Experiments with simulated data and microarray data indicate that the methods are a practical alternative to Gaussian graphical models.
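
The Kruskal step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `max_weight_forest`, the `max_edges` parameter, and the rule of stopping at non-positive weights are assumptions made for illustration. The edge weights stand in for whatever pairwise scores are available (for example, mutual information estimated from the kernel density estimates), and the kernel estimation and held-out selection steps are not shown.

```python
from typing import Dict, List, Optional, Tuple

Edge = Tuple[int, int]


def max_weight_forest(
    num_nodes: int,
    weights: Dict[Edge, float],
    max_edges: Optional[int] = None,
) -> List[Edge]:
    """Greedy (Kruskal-style) maximum weight spanning forest.

    Hypothetical helper: `weights` maps an undirected edge (u, v) to a
    score. Edges are considered from highest to lowest score and kept
    only if they do not close a cycle.
    """
    parent = list(range(num_nodes))  # union-find parent pointers

    def find(x: int) -> int:
        # Find the component root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest: List[Edge] = []
    ordered = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    for (u, v), w in ordered:
        if w <= 0:
            break  # assumption: non-positive scores never improve the forest
        if max_edges is not None and len(forest) >= max_edges:
            break  # assumption: optional cap on the number of edges
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two different trees
            parent[root_u] = root_v
            forest.append((u, v))
    return forest


example_weights = {(0, 1): 0.9, (1, 2): 0.8, (0, 2): 0.7, (2, 3): -0.1}
forest = max_weight_forest(4, example_weights)
```

In this toy input the edge (0, 2) is skipped because it would close a cycle, and the edge (2, 3) is never added because its score is negative, so node 3 remains isolated and the result is a forest rather than a spanning tree. Note that this greedy procedure does not enforce the restricted tree size discussed in the abstract, which the paper shows to be NP-hard.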