Head-Transducer Models for Speech Translation and Their Automatic Acquisition from Bilingual Data

  • Authors:
  • Hiyan Alshawi; Srinivas Bangalore; Shona Douglas

  • Affiliations:
  • AT&T Labs Research, 180 Park Avenue, PO Box 971, Florham Park, NJ 07932, USA (all authors)

  • Venue:
  • Machine Translation
  • Year:
  • 2000

Abstract

This article presents statistical language translation models, called "dependency transduction models", based on collections of "head transducers". Head transducers are middle-out finite-state transducers which translate a head word in a source string into its corresponding head in the target language, and further translate sequences of dependents of the source head into sequences of dependents of the target head. The models are intended to capture the lexical sensitivity of direct statistical translation models, while at the same time taking account of the hierarchical phrasal structure of language. Head transducers are suitable for direct recursive lexical translation, and are simple enough to be trained fully automatically. We present a method for fully automatic training of dependency transduction models for which the only input is transcribed and translated speech utterances. The method has been applied to create English–Spanish and English–Japanese translation models for speech translation applications. The dependency transduction model gives around 75% accuracy for an English–Spanish translation task (using a simple string edit-distance measure) and 70% for an English–Japanese translation task. Enhanced with target n-grams and a case-based component, English–Spanish accuracy is over 76%; for English–Japanese it is 73% for transcribed speech, and 60% for translation from recognition word lattices.
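The core idea of head transduction — translate the head word, then recursively translate its dependent sequences — can be illustrated with a toy sketch. This is only a minimal, hand-written illustration of the recursive control structure, not the paper's trained statistical model: the lexicon, the tree encoding, and the monotone (source-order-preserving) placement of dependents are all invented for demonstration, whereas the actual head transducers learn weighted, possibly reordering transductions from bilingual data.

```python
# Toy recursive head transduction (illustration only; the lexicon and
# reordering behaviour here are hypothetical, not the trained model).
HEAD_LEXICON = {
    "want": "quiero",
    "I": None,       # pro-drop: subject pronoun deleted in the target
    "coffee": "cafe",
    "a": "un",
}

def transduce(node):
    """node = (head_word, left_dependents, right_dependents).
    Translates the head via the lexicon, then recursively transduces
    the left and right dependent sequences, keeping source order."""
    head, left, right = node
    tgt_head = HEAD_LEXICON.get(head, head)
    tgt_left = [w for dep in left for w in transduce(dep)]
    tgt_right = [w for dep in right for w in transduce(dep)]
    # A head mapped to None is deleted; dependents still surface.
    return tgt_left + ([tgt_head] if tgt_head else []) + tgt_right

# "I want a coffee": head "want", left dep "I",
# right dep "coffee" (which has its own left dep "a").
tree = ("want", [("I", [], [])], [("coffee", [("a", [], [])], [])])
print(" ".join(transduce(tree)))  # -> quiero un cafe
```

The recursion mirrors the abstract's description: translation is driven top-down from heads to dependents, so lexical choice stays sensitive to the head word while the hierarchical phrase structure is respected.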