Learning Bayesian network classifiers by maximizing conditional likelihood

Authors:
Daniel Grossman;Pedro Domingos
Affiliations:
University of Washington, Seattle, WA;University of Washington, Seattle, WA
Venue:
ICML '04 Proceedings of the twenty-first international conference on Machine learning
Year:
2004

Abstract

Bayesian networks are a powerful probabilistic representation, and their use for classification has received considerable attention. However, they tend to perform poorly when learned in the standard way. This is attributable to a mismatch between the objective function used (likelihood or a function thereof) and the goal of classification (maximizing accuracy or conditional likelihood). Unfortunately, the computational cost of optimizing both structure and parameters for conditional likelihood is prohibitive. In this paper we show that a simple approximation, choosing structures by maximizing conditional likelihood while setting parameters by maximum likelihood, yields good results. On a large suite of benchmark datasets, this approach produces better class probability estimates than naive Bayes, TAN, and generatively trained Bayesian networks.
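
The approximation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`fit`, `cll`, `learn_structure`), the add-one smoothing constant `alpha`, the `max_parents` limit, the choice to start from naive Bayes, and the use of training-set conditional log-likelihood as the score are all assumptions made for illustration. Each attribute always has the class as a parent; the search greedily adds attribute-to-attribute edges whenever they raise the conditional log-likelihood, sum over examples of log P(c | x), while the parameters are simple (smoothed) relative frequencies.

```python
import itertools
import math
from collections import Counter

def fit(data, parents, alpha=1.0):
    """Estimate parameters by smoothed relative frequencies (alpha is an assumption)."""
    classes = sorted({c for _, c in data})
    values = [sorted({x[i] for x, _ in data}) for i in range(len(parents))]
    class_counts = Counter(c for _, c in data)
    joint, ctx = Counter(), Counter()
    for x, c in data:
        for i, pa in enumerate(parents):
            key = (i, c, tuple(x[j] for j in pa))
            ctx[key] += 1                 # count of (class, parent values)
            joint[key + (x[i],)] += 1     # count of (class, parent values, x_i)

    def log_joint(x, c):
        """log P(c, x) = log P(c) + sum_i log P(x_i | c, parents of x_i)."""
        lp = math.log((class_counts[c] + alpha) / (len(data) + alpha * len(classes)))
        for i, pa in enumerate(parents):
            key = (i, c, tuple(x[j] for j in pa))
            lp += math.log((joint[key + (x[i],)] + alpha)
                           / (ctx[key] + alpha * len(values[i])))
        return lp

    return log_joint, classes

def cll(data, log_joint, classes):
    """Conditional log-likelihood: sum over examples of log P(c | x)."""
    total = 0.0
    for x, c in data:
        logs = [log_joint(x, k) for k in classes]
        m = max(logs)
        log_evidence = m + math.log(sum(math.exp(v - m) for v in logs))
        total += log_joint(x, c) - log_evidence
    return total

def is_ancestor(parents, a, b):
    """True if attribute a equals b or is an ancestor of b."""
    stack, seen = [b], set()
    while stack:
        node = stack.pop()
        if node == a:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False

def learn_structure(data, n_attrs, max_parents=2):
    """Greedy hill climbing over attribute edges, scored by CLL."""
    parents = [() for _ in range(n_attrs)]   # start from naive Bayes
    best = cll(data, *fit(data, parents))
    while True:
        candidates = []
        for j, i in itertools.permutations(range(n_attrs), 2):
            # Skip existing edges, full parent sets, and edges that would form a cycle.
            if j in parents[i] or len(parents[i]) >= max_parents \
                    or is_ancestor(parents, i, j):
                continue
            trial = list(parents)
            trial[i] = parents[i] + (j,)
            candidates.append((cll(data, *fit(data, trial)), trial))
        if not candidates:
            return parents, best
        score, trial = max(candidates, key=lambda t: t[0])
        if score <= best + 1e-9:             # no edge improves the score
            return parents, best
        parents, best = trial, score

# Toy data (hypothetical): the class is the XOR of two binary attributes,
# which naive Bayes cannot represent.
data = [((a, b), a ^ b) for a in (0, 1) for b in (0, 1)] * 5
nb_score = cll(data, *fit(data, [(), ()]))
parents, score = learn_structure(data, n_attrs=2)
print("naive Bayes CLL:", nb_score)
print("learned parents:", parents, "CLL:", score)
```

Scoring on the training set, as done here, is a simplification that can overfit; a held-out set or a complexity penalty would be a natural guard, and the parent limit plays a similar role in this sketch.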