Text compression
COLT '90 Proceedings of the third annual workshop on Computational learning theory
Learning probabilistic prediction functions
COLT '88 Proceedings of the first annual workshop on Computational learning theory
C4.5: programs for machine learning
The weighted majority algorithm
Information and Computation
The power of amnesia: learning probabilistic automata with variable memory length
Machine Learning - Special issue on COLT '94
Predicting Nearly As Well As the Best Pruning of a Decision Tree
Machine Learning - Special issue on the eighth annual conference on computational learning theory (COLT '95)
Journal of the ACM (JACM)
A decision-theoretic generalization of on-line learning and an application to boosting
Journal of Computer and System Sciences - Special issue on the 26th annual ACM Symposium on Theory of Computing (STOC '94), May 23–25, 1994, and the second annual European Conference on Computational Learning Theory (EuroCOLT '95), March 13–15, 1995
Adaptive mixtures of probabilistic transducers
Neural Computation
Statistical methods for speech recognition
A universal finite memory source
IEEE Transactions on Information Theory
The context-tree weighting method: basic properties
IEEE Transactions on Information Theory
Efficiently Approximating Weighted Sums with Exponentially Many Terms
COLT '01/EuroCOLT '01 Proceedings of the 14th Annual Conference on Computational Learning Theory and 5th European Conference on Computational Learning Theory
Learning theory and language modeling
Exploring artificial intelligence in the new millennium
Detecting errors within a corpus using anomaly detection
NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
On approximating weighted sums with exponentially many terms
Journal of Computer and System Sciences
Selective Rademacher Penalization and Reduced Error Pruning of Decision Trees
The Journal of Machine Learning Research
Learning prediction suffix trees with Winnow
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
An analysis of reduced error pruning
Journal of Artificial Intelligence Research
Individual sequence prediction using memory-efficient context trees
IEEE Transactions on Information Theory
Being Bayesian about network structure
UAI'00 Proceedings of the Sixteenth conference on Uncertainty in artificial intelligence
Tracking the best of many experts
COLT'05 Proceedings of the 18th annual conference on Learning Theory
We present an efficient method for maintaining mixtures of prunings of a prediction or decision tree that extends the previous methods for “node-based” prunings (Buntine, 1990; Willems, Shtarkov, & Tjalkens, 1995; Helmbold & Schapire, 1997; Singer, 1997) to the larger class of edge-based prunings. The method includes an online weight-allocation algorithm that can be used for prediction, compression, and classification. Although the set of edge-based prunings of a given tree is much larger than the set of node-based prunings, our algorithm has space and time complexity similar to that of previous mixture algorithms for trees. Using the general online framework of Freund and Schapire (1997), we prove that our algorithm correctly maintains the mixture weights for edge-based prunings under any bounded loss function. We also give a corresponding algorithm and weight-allocation scheme for the logarithmic loss function. Finally, we describe experiments comparing node-based and edge-based mixture models for estimating the probability of the next word in English text; the results show the advantages of edge-based models.
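The abstract describes the algorithm only at a high level. As a rough illustration of the general online framework it builds on (Freund & Schapire, 1997), the following minimal Python sketch runs Hedge-style multiplicative weight updates over an explicitly enumerated set of experts. The expert set, loss values, and learning rate eta are illustrative assumptions; the paper's contribution is maintaining such a mixture implicitly over the exponentially large set of edge-based prunings, which this toy version does not attempt.

import numpy as np

def hedge(expert_losses, eta=0.5):
    """Hedge (Freund & Schapire, 1997) over T rounds.

    expert_losses: array of shape (T, n), per-round losses in [0, 1]
                   for each of n experts (here, hypothetical prunings).
    eta:           learning rate (illustrative choice).
    Returns the final normalized weight vector over the experts.
    """
    T, n = expert_losses.shape
    weights = np.full(n, 1.0 / n)        # uniform prior over experts
    for t in range(T):
        # Multiplicative update: low-loss experts retain more weight.
        weights *= np.exp(-eta * expert_losses[t])
        weights /= weights.sum()         # renormalize to a mixture
    return weights

# Toy usage: 3 hypothetical "prunings", 100 rounds of synthetic losses.
rng = np.random.default_rng(0)
losses = rng.uniform(0.0, 1.0, size=(100, 3))
losses[:, 1] *= 0.5                      # expert 1 is consistently better
print(hedge(losses))                     # weight concentrates on expert 1

With losses bounded in [0, 1], Hedge guarantees that the mixture's cumulative loss stays close to that of the best single expert in hindsight, which is the kind of bound the paper proves for its implicit mixture over edge-based prunings.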