No statistical model is "true" or "false," "right" or "wrong"; models simply perform better or worse, and that performance can be assessed. The main theme of this book is to teach modeling based on the principle that the objective is to extract from the data the information that can be learned with a suggested class of probability models. The intuitive yet fundamental concepts of complexity, learnable information, and noise are formalized, providing a firm information-theoretic foundation for statistical modeling.

Inspired by Kolmogorov's structure function in the algorithmic theory of complexity, this is accomplished by finding the shortest code length, called the stochastic complexity, with which the data can be encoded when advantage is taken of the models in the suggested class; this amounts to the MDL (Minimum Description Length) principle. The stochastic complexity, in turn, splits into the shortest code length for the optimal model among those that can be optimally distinguished from the given data, plus a remainder that defines "noise" as the incompressible part of the data carrying no useful information. This view of the modeling problem permits a unified treatment of parameters of any type, their number, and even their structure. Since only optimally distinguished models are worth testing, hypothesis testing receives a logically sound and straightforward treatment, in which for the first time the confidence in the test result can be assessed.

Although the prerequisites include only basic probability calculus and statistics, a moderate level of mathematical proficiency is beneficial. This different and logically unassailable view of statistical modeling should provide excellent grounds for further research and suggest topics for graduate students in all fields of modern engineering, including signal and image processing, bioinformatics, pattern recognition, and machine learning.
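To make the decomposition concrete, the following is a minimal sketch (not taken from the book; function names are illustrative) of the normalized maximum likelihood (NML) stochastic complexity for the simplest case, the Bernoulli model class. The code length splits into the maximized-likelihood term, which measures the residual "noise," and the parametric complexity of the model class, which measures how many models the data could optimally distinguish.

```python
import math

def bernoulli_nml_complexity(n):
    # Parametric complexity (log-regret, in bits) of the Bernoulli model
    # class for sequences of length n:
    #   C_n = sum_{k=0}^{n} C(n,k) (k/n)^k ((n-k)/n)^(n-k)
    total = 0.0
    for k in range(n + 1):
        p = k / n
        # Python evaluates 0.0 ** 0 as 1.0, matching the 0*log 0 = 0 convention.
        total += math.comb(n, k) * (p ** k) * ((1 - p) ** (n - k))
    return math.log2(total)

def stochastic_complexity(bits):
    # NML code length of a 0/1 sequence: the negative log of the maximized
    # likelihood ("noise" term) plus the parametric complexity term.
    n = len(bits)
    k = sum(bits)
    nll = 0.0
    if 0 < k < n:
        p_hat = k / n
        nll = -(k * math.log2(p_hat) + (n - k) * math.log2(1 - p_hat))
    return nll + bernoulli_nml_complexity(n)
```

A constant sequence has zero likelihood cost, so its entire code length is the parametric complexity of the class, while a maximally mixed sequence pays the full one bit per symbol on top of it; the difference between the two is exactly the incompressible-noise part of the data.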