We describe and analyze a mixture model for supervised learning of probabilistic transducers. We devise an online learning algorithm that efficiently infers the structure and estimates the parameters of each probabilistic transducer in the mixture. Theoretical analysis and comparative simulations indicate that the learning algorithm tracks the best transducer from an arbitrarily large (possibly infinite) pool of models. We also present an application of the model for inducing a noun phrase recognizer.
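The core idea of tracking the best model in a pool can be illustrated with a minimal sketch. This is not the paper's algorithm verbatim: it assumes a fixed, finite pool of experts, each assigning a conditional probability to the next symbol, and uses a standard Bayesian mixture update in which each expert's weight is multiplied by the likelihood it assigned. Under these assumptions the mixture's cumulative log-loss exceeds the best expert's by at most log of the pool size, which is the sense in which the mixture "tracks" the best predictor.

```python
import math

def online_mixture(experts, sequence):
    """Online Bayesian mixture over a pool of probabilistic predictors.

    experts: list of functions (history, symbol) -> P(symbol | history).
    Returns the final weights and the mixture's cumulative log2-loss.
    """
    weights = [1.0 / len(experts)] * len(experts)  # uniform prior
    total_log_loss = 0.0
    history = []
    for symbol in sequence:
        # Mixture prediction: weighted average of the experts' probabilities.
        probs = [e(history, symbol) for e in experts]
        p_mix = sum(w * p for w, p in zip(weights, probs))
        total_log_loss -= math.log2(p_mix)
        # Bayes update: reweight each expert by the likelihood it assigned.
        weights = [w * p for w, p in zip(weights, probs)]
        norm = sum(weights)
        weights = [w / norm for w in weights]
        history.append(symbol)
    return weights, total_log_loss

# Two toy experts over a binary alphabet (hypothetical, for illustration):
# one biased toward symbol 1, one uniform.
biased = lambda h, s: 0.9 if s == 1 else 0.1
uniform = lambda h, s: 0.5
w, loss = online_mixture([biased, uniform], [1] * 20)
# On a stream of twenty 1s, the weight concentrates on the biased expert,
# and the mixture's loss stays within 1 bit (log2 of 2 experts) of its loss.
```

The paper's model goes further: the experts are probabilistic transducers whose structure is itself inferred online, and the pool may be arbitrarily large or infinite, but the weight-update mechanism follows this same multiplicative pattern.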