NML computation algorithms for tree-structured multinomial Bayesian networks

Authors:
Petri Kontkanen; Hannes Wettig; Petri Myllymäki
Affiliations:
Complex Systems Computation Group, Helsinki Institute for Information Technology, University of Helsinki, Finland (all three authors)
Venue:
EURASIP Journal on Bioinformatics and Systems Biology
Year:
2007

Abstract

Typical problems in bioinformatics involve large discrete datasets. Applying statistical methods in such domains therefore requires efficient algorithms suited to discrete data. The minimum description length (MDL) principle is a theoretically well-founded, general framework for statistical inference. The mathematical formalization of MDL is based on the normalized maximum likelihood (NML) distribution, which has several desirable theoretical properties. For discrete data, straightforward computation of the NML distribution takes time exponential in the sample size, because the definition involves a sum over all possible data samples of a fixed size. In this paper, we first review some existing algorithms for efficient NML computation for the multinomial and naive Bayes model families. We then extend these algorithms to more complex, tree-structured Bayesian networks.
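The abstract gives no formulas, so the following is only a minimal sketch of why the straightforward computation is expensive, restricted to the simplest case of a single multinomial variable with K values and sample size n (n >= 1). The function names are hypothetical. The first normalizer sums the maximized likelihood over all K**n sequences, as in the definition. The second uses a recurrence known from the literature on multinomial stochastic complexity, C(k+2, n) = C(k+1, n) + (n/k) * C(k, n), which is assumed here as an illustration of an efficient alternative and is not necessarily the exact procedure reviewed in the paper.

```python
from collections import Counter
from itertools import product
from math import comb


def max_likelihood(seq, n):
    """Maximized multinomial likelihood of one sequence: prod_k (h_k / n) ** h_k."""
    p = 1.0
    for h in Counter(seq).values():
        p *= (h / n) ** h
    return p


def nml_normalizer_brute(K, n):
    """Sum the maximized likelihood over all K**n sequences (exponential in n)."""
    return sum(max_likelihood(seq, n) for seq in product(range(K), repeat=n))


def nml_normalizer_recurrence(K, n):
    """Same normalizer via the assumed recurrence C(k+2) = C(k+1) + (n/k) * C(k)."""
    c_prev = 1.0  # K = 1: every sequence has maximized likelihood 1, and there is one
    if K == 1:
        return c_prev
    # K = 2: sum over the count h of the first symbol (Python evaluates 0.0 ** 0 as 1.0)
    c_curr = sum(
        comb(n, h) * (h / n) ** h * ((n - h) / n) ** (n - h) for h in range(n + 1)
    )
    for k in range(1, K - 1):
        c_prev, c_curr = c_curr, c_curr + (n / k) * c_prev
    return c_curr


def nml_probability(seq, K):
    """NML probability of a sequence: maximized likelihood divided by the normalizer."""
    n = len(seq)
    return max_likelihood(seq, n) / nml_normalizer_recurrence(K, n)
```

For small inputs the two normalizers can be compared directly, for example `nml_normalizer_brute(3, 4)` against `nml_normalizer_recurrence(3, 4)`. The brute-force version becomes impractical quickly because the number of sequences grows as K**n.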