Structured induction in expert systems
This paper describes how a competitive tree learning algorithm can be derived from first principles. The algorithm approximates the Bayesian decision-theoretic solution to the learning task. Comparative experiments between this algorithm and several mature AI and statistical families of tree learning algorithms currently in use show that the derived Bayesian algorithm is consistently as good or better, although sometimes at greater computational cost. Using the same strategy, algorithms can be designed for many other supervised and model learning tasks, given only a probabilistic representation of the kind of knowledge to be learned. As an illustration, a second learning algorithm is derived for learning Bayesian networks from data. Implications for incremental learning and the use of multiple models are also discussed.
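The abstract does not spell out the algorithm, but the core Bayesian ingredient it describes can be illustrated. The sketch below, which is an assumption rather than the paper's actual method, scores candidate tree splits by their marginal likelihood under a symmetric Dirichlet prior on the class probabilities at each leaf; the function names (`log_marginal`, `split_log_score`) are hypothetical:

```python
from math import lgamma

def log_marginal(counts, alpha=1.0):
    # log P(class counts at a leaf) under a symmetric Dirichlet(alpha) prior:
    # log B(alpha + counts) - log B(alpha), with B the multivariate Beta function.
    k = len(counts)
    n = sum(counts)
    return (lgamma(k * alpha) - lgamma(k * alpha + n)
            + sum(lgamma(alpha + c) - lgamma(alpha) for c in counts))

def split_log_score(left_counts, right_counts):
    # Score a candidate binary split as the product (sum in log space)
    # of the two children's marginal likelihoods.
    return log_marginal(left_counts) + log_marginal(right_counts)

# A split that separates the classes well scores higher than not splitting,
# so the marginal likelihood itself drives split selection.
unsplit_score = log_marginal([10, 10])
split_score = split_log_score([9, 1], [1, 9])
```

Under this kind of score, pruning and split selection fall out of the same posterior comparison, rather than requiring a separate heuristic, which is in the spirit of the decision-theoretic derivation the abstract claims.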