Statistical Machine Translation of Broadcast News from Spanish to Portuguese

Authors:
Raquel Sánchez Martínez;João Paulo Silva Neto;Diamantino António Caseiro
Affiliations:
L2F - Spoken Language Systems Laboratory, INESC ID Lisboa, Lisboa, Portugal 1000-029;L2F - Spoken Language Systems Laboratory, INESC ID Lisboa, Lisboa, Portugal 1000-029;L2F - Spoken Language Systems Laboratory, INESC ID Lisboa, Lisboa, Portugal 1000-029
Venue:
PROPOR '08 Proceedings of the 8th international conference on Computational Processing of the Portuguese Language
Year:
2008

Abstract

This paper describes the development of an automatic system for translating broadcast news from Spanish to Portuguese. The work involved two challenging areas of speech and language processing: Automatic Speech Recognition (ASR) of the Spanish news, and Statistical Machine Translation (SMT) of the recognition results into Portuguese. ASR of broadcast news is based on the AUDIMUS.MEDIA system, a hybrid ANN/HMM system with multiple-stream decoding. It achieved a Word Error Rate (WER) of 22.08% on a Spanish Broadcast News task, which is comparable to other international state-of-the-art systems. Parallel normalized texts from the European Parliament database were used to train the SMT system from Spanish to Portuguese. A preliminary, non-exhaustive human evaluation showed a fluency of 3.74 and a sufficiency of 4.23.
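
For readers unfamiliar with the WER figure quoted above, the following is a minimal sketch of how the metric is conventionally computed: the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The function name `word_error_rate` and the example sentences are illustrative assumptions; this is not the scoring tool used by the authors, and it applies no text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration only; whitespace tokenization
    and the absence of normalization are simplifying assumptions.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("el presidente habló ayer", "el presidente hablo ayer"))
```

Under this definition, a WER of 22.08% corresponds to roughly 22 word errors per 100 reference words. Because insertions are counted, the value can exceed 100% for very poor output.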