Bayesian inference for finite-state transducers

Authors:
David Chiang;Jonathan Graehl;Kevin Knight;Adam Pauls;Sujith Ravi
Affiliations:
University of Southern California, Marina del Rey, CA;University of Southern California, Marina del Rey, CA;University of Southern California, Marina del Rey, CA;University of California at Berkeley, Berkeley, CA;University of Southern California, Marina del Rey, CA
Venue:
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Year:
2010

Citing 18
Cited 7

A stochastic finite-state word-segmentation algorithm for Chinese

Computational Linguistics
Breaking substitution ciphers using a relaxation algorithm

Communications of the ACM
Translation with Finite-State Devices

AMTA '98 Proceedings of the Third Conference of the Association for Machine Translation in the Americas on Machine Translation and the Information Soup
Tagging English text with a probabilistic model

Computational Linguistics
Machine transliteration

Computational Linguistics
Memory-Based Learning of morphology with stochastic transducers

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
A generative probabilistic OCR model for NLP applications

NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Weighted rational transductions and their application to human language processing

HLT '94 Proceedings of the workshop on Human Language Technology
Incorporating non-local information into information extraction systems by Gibbs sampling

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Unsupervised analysis for decipherment problems

COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
Training tree transducers

Computational Linguistics
Sampling alignment structure under a Bayesian translation model

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
A comparison of Bayesian estimators for unsupervised Hidden Markov Model POS taggers

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Attacking decipherment problems optimally with low-order N-gram models

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Learning phoneme mappings for transliteration without parallel data

NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Minimized models for unsupervised part-of-speech tagging

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
A Gibbs sampler for phrasal synchronous grammar induction

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2 - Volume 2
An overview of probabilistic tree transducers for natural language processing

CICLing'05 Proceedings of the 6th international conference on Computational Linguistics and Intelligent Text Processing

Efficient optimization of an MDL-inspired objective function for unsupervised part-of-speech tagging

ACLShort '10 Proceedings of the ACL 2010 Conference Short Papers
Deciphering foreign language

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Bayesian inference for Zodiac and other homophonic ciphers

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Bayesian word alignment for statistical machine translation

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
Models and training for unsupervised preposition sense disambiguation

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
Semantic parsing with Bayesian tree transducers

ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
Bayesian Constituent Context Model for Grammar Induction

IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)

Quantified Score

Hi-index	0.00

Visualization

Abstract

We describe a Bayesian inference algorithm that can be used to train any cascade of weighted finite-state transducers on end-to-end data. We also investigate the problem of automatically selecting from among multiple training runs. Our experiments on four different tasks demonstrate the genericity of this framework, and, where applicable, large improvements in performance over EM. We also show, for unsupervised part-of-speech tagging, that automatic run selection gives a large improvement over previous Bayesian approaches.