Probabilistic reasoning in intelligent systems: networks of plausible inference
Elements of information theory
Connectionist learning of belief networks
Artificial Intelligence
C4.5: programs for machine learning
Maximum entropy and learning theory
Neural Computation
On the effective implementation of the iterative proportional fitting procedure
Computational Statistics & Data Analysis - Special issue dedicated to Tomáš Havránek
A maximum entropy approach to natural language processing
Computational Linguistics
Graphical models for machine learning and digital communication
Estimating dependency structure as a hidden variable
NIPS '97 Proceedings of the 1997 conference on Advances in neural information processing systems 10
Pattern Recognition and Neural Networks
A Guide to the Literature on Learning Probabilistic Networks from Data
IEEE Transactions on Knowledge and Data Engineering
ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
Kutato: an entropy-driven system for construction of probabilistic expert systems from databases
UAI '90 Proceedings of the Sixth Annual Conference on Uncertainty in Artificial Intelligence
Neural Networks: A Comprehensive Foundation (3rd Edition)
Minimax Entropy Principle and Its Application to Texture Modeling
Neural Computation
Building classifiers using Bayesian networks
AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 2
Critic-driven ensemble classification
IEEE Transactions on Signal Processing
A global optimization technique for statistical classifier design
IEEE Transactions on Signal Processing
A Maximum Entropy Approach for Collaborative Filtering
Journal of VLSI Signal Processing Systems
Transductive Methods for the Distributed Ensemble Classification Problem
Neural Computation
An Extension of Iterative Scaling for Decision and Data Aggregation in Ensemble Classification
Journal of VLSI Signal Processing Systems
We propose a new learning method for discrete-space statistical classifiers. Following Chow and Liu (1968) and Cheeseman (1983), we cast classification/inference within the more general framework of estimating the joint probability mass function (p.m.f.) for the (feature vector, class label) pair. Cheeseman's proposal to build the maximum entropy (ME) joint p.m.f. consistent with general lower-order probability constraints is in principle powerful, since it allows general dependencies between features. In practice, however, enormous learning complexity has severely limited the use of this approach. Alternative models such as Bayesian networks (BNs) require explicit determination of conditional independencies, which may be difficult to assess given limited data. Here we propose an approximate ME method which, like Cheeseman's, incorporates general constraints while retaining quite tractable learning. The new method restricts the support of the joint p.m.f. during learning to a small subset of the full feature space. Classification gains are realized over dependence trees, tree-augmented naive Bayes networks, BNs trained by the Kutato algorithm, and multilayer perceptrons. Extensions to more general inference problems are indicated. We also propose a novel exact inference method for the case of several missing features.
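The abstract leaves the constraint-fitting step implicit. The underlying ME problem is: choose the p.m.f. P maximizing the entropy H(P) subject to matching a set of lower-order (e.g., pairwise) marginals, i.e., the sum of P(x) over {x : x_i = a, x_j = b} equals the empirical P_ij(a, b) for every constrained pair (i, j) and cell (a, b). Such constraint sets are classically fit by iterative proportional fitting (IPF, cited above). The following Python/NumPy sketch is illustrative only, not the authors' implementation: it assumes binary features, pairwise constraints, and takes the configurations observed in training as the restricted support; all names (classify, n_features, and so on) are hypothetical.

import itertools
import numpy as np

rng = np.random.default_rng(0)

n_features = 4                      # binary features; last column is the class label
n_vars = n_features + 1

# Stand-in training data: random binary (feature vector, class label) samples.
data = rng.integers(0, 2, size=(200, n_vars))

# Empirical pairwise marginals P(x_i = a, x_j = b) serve as the constraints.
pairs = list(itertools.combinations(range(n_vars), 2))
targets = {}
for i, j in pairs:
    m = np.zeros((2, 2))
    for a in range(2):
        for b in range(2):
            m[a, b] = np.mean((data[:, i] == a) & (data[:, j] == b))
    targets[(i, j)] = m

# Restricted support: only configurations actually observed in training
# (one simple instance of the paper's support-restriction idea).
support = np.unique(data, axis=0)

# Iterative proportional fitting: start uniform on the support, then
# repeatedly rescale so each pairwise marginal matches its target.
p = np.full(len(support), 1.0 / len(support))
for _ in range(100):
    for (i, j), tgt in targets.items():
        for a in range(2):
            for b in range(2):
                mask = (support[:, i] == a) & (support[:, j] == b)
                cur = p[mask].sum()
                if cur > 0:
                    p[mask] *= tgt[a, b] / cur
        # Renormalize; constraints unreachable on the restricted support
        # are simply skipped, which is where the approximation enters.
        p /= p.sum()

def classify(x):
    # Plug-in classification: pick the label maximizing the joint p.m.f.
    probs = []
    for c in (0, 1):
        mask = np.all(support[:, :n_features] == x, axis=1)
        mask &= support[:, n_features] == c
        probs.append(p[mask].sum())
    return int(np.argmax(probs))    # ties/unseen x default to label 0

print(classify(data[0, :n_features]))

Restricting the IPF updates to the observed support keeps each sweep linear in the number of support points rather than exponential in the number of features, which is the source of the tractability the abstract claims.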