In this paper we are interested in discrete prediction problems in a decision-theoretic setting, where the task is to compute the predictive distribution over a finite set of possible alternatives. This question is first addressed in a general Bayesian framework, where we consider a set of probability distributions defined by some parametric model class. Given a prior distribution on the model parameters and a set of sample data, one possible approach to determining a predictive distribution is to fix the parameters to the instantiation with the maximum a posteriori probability. A more accurate predictive distribution can be obtained by computing the evidence (marginal likelihood), i.e., the integral over all the individual parameter instantiations. As an alternative to these two approaches, we demonstrate how to use Rissanen's new definition of stochastic complexity to determine predictive distributions, and show how the evidence predictive distribution with the Jeffreys prior approaches the new stochastic complexity predictive distribution in the limit as the amount of sample data increases. To compare the alternative approaches in practice, each of the predictive distributions discussed is instantiated for the Bayesian network model family. In particular, to determine the Jeffreys prior for this model family, we show how to compute the (expected) Fisher information matrix for a fixed but arbitrary Bayesian network structure. In the empirical part of the paper the predictive distributions are compared using the simple tree-structured Naive Bayes model, chosen for computational reasons. The experiments with several public-domain classification datasets suggest that the evidence approach produces the most accurate predictions in the log-score sense. The evidence-based methods are also quite robust in the sense that they predict surprisingly well even when only a small fraction of the full training set is used.
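The contrast between the three kinds of predictive distribution can be sketched in the simplest possible case, a Bernoulli model. This is only an illustrative toy example, not the paper's Bayesian-network experiments: `predictive_bernoulli` is a hypothetical helper, the plug-in predictive uses the maximum likelihood parameters, the evidence predictive uses the Jeffreys prior Beta(1/2, 1/2), and the stochastic complexity approach is approximated here by a conditional (sequential) NML predictive, where each candidate outcome is scored by the maximized likelihood of the extended sequence.

```python
def predictive_bernoulli(k, n):
    """Three predictive probabilities P(next = 1 | k ones observed in n trials).

    A toy Bernoulli sketch of the approaches discussed in the abstract:
    plug-in (ML), Bayesian evidence under the Jeffreys prior, and a
    conditional-NML approximation to the stochastic complexity predictive.
    """
    # Plug-in predictive: fix parameters to their maximum likelihood value.
    p_ml = k / n

    # Evidence (marginal likelihood) predictive under the Jeffreys prior
    # Beta(1/2, 1/2); the posterior predictive is (k + 1/2) / (n + 1).
    p_ev = (k + 0.5) / (n + 1.0)

    # Conditional NML predictive: P(x) is proportional to the likelihood of
    # the extended sequence maximized over the Bernoulli parameter.
    def max_likelihood(ones, total):
        p = ones / total
        return (p ** ones) * ((1 - p) ** (total - ones))  # 0**0 == 1 in Python

    w1 = max_likelihood(k + 1, n + 1)  # next outcome is 1
    w0 = max_likelihood(k, n + 1)      # next outcome is 0
    p_nml = w1 / (w0 + w1)

    return p_ml, p_ev, p_nml
```

For moderate sample sizes the Jeffreys-prior evidence predictive and the NML-style predictive already lie close to each other, which is the qualitative behaviour the asymptotic result in the paper describes; the plug-in predictive is the least regularized of the three.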