Typical problems in bioinformatics involve large discrete datasets. Therefore, in order to apply statistical methods in such domains, it is important to develop efficient algorithms suitable for discrete data. The minimum description length (MDL) principle is a theoretically well-founded, general framework for performing statistical inference. The mathematical formalization of MDL is based on the normalized maximum likelihood (NML) distribution, which has several desirable theoretical properties. In the case of discrete data, straightforward computation of the NML distribution requires exponential time with respect to the sample size, since the definition involves a sum over all the possible data samples of a fixed size. In this paper, we first review some existing algorithms for efficient NML computation in the case of multinomial and naive Bayes model families. Then we proceed by extending these algorithms to more complex, tree-structured Bayesian networks.
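To make the cost of the definition concrete, the following minimal sketch computes the NML normalizing sum for a single multinomial variable in two ways. The first function enumerates all K^n sequences, which is the exponential-time approach described above. The second uses a recurrence known from the multinomial NML literature, C(K+2, n) = C(K+1, n) + (n/K) C(K, n), which needs only O(n + K) arithmetic operations. The function names and the choice of Python are assumptions made for illustration and are not taken from the paper; the tree-structured extension is not covered here.

```python
from collections import Counter
from itertools import product
from math import comb


def nml_normalizer_brute_force(k, n):
    """Sum the maximized likelihood over all k**n sequences of length n."""
    total = 0.0
    for seq in product(range(k), repeat=n):
        likelihood = 1.0
        # The maximum likelihood parameters are the relative frequencies c / n.
        for c in Counter(seq).values():
            likelihood *= (c / n) ** c
        total += likelihood
    return total


def nml_normalizer_recurrence(k, n):
    """Same quantity via C(j+2, n) = C(j+1, n) + (n / j) * C(j, n)."""
    if n == 0:
        return 1.0
    c_prev = 1.0  # C(1, n): a single outcome gives one sequence with likelihood 1
    if k == 1:
        return c_prev
    # C(2, n): direct sum over the count h of the first outcome.
    # Python evaluates 0.0 ** 0 as 1.0, which matches the convention needed here.
    c_curr = sum(
        comb(n, h) * (h / n) ** h * ((n - h) / n) ** (n - h)
        for h in range(n + 1)
    )
    for j in range(1, k - 1):
        c_prev, c_curr = c_curr, c_curr + (n / j) * c_prev
    return c_curr
```

The brute-force version is only feasible for very small K and n, but it is useful as a check: both functions should agree on small inputs, for example K = 3 and n = 4.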