Impacts of machine translation and speech synthesis on speech-to-speech translation

  • Authors:
  • Kei Hashimoto; Junichi Yamagishi; William Byrne; Simon King; Keiichi Tokuda

  • Affiliations:
  • Nagoya Institute of Technology, Department of Computer Science and Engineering, Nagoya, Japan; University of Edinburgh, Centre for Speech Technology Research, Edinburgh, United Kingdom; Cambridge University, Engineering Department, Cambridge, United Kingdom; University of Edinburgh, Centre for Speech Technology Research, Edinburgh, United Kingdom; Nagoya Institute of Technology, Department of Computer Science and Engineering, Nagoya, Japan

  • Venue:
  • Speech Communication
  • Year:
  • 2012

Abstract

This paper analyzes the impacts of machine translation and speech synthesis on speech-to-speech translation systems. A typical speech-to-speech translation system consists of three components: speech recognition, machine translation, and speech synthesis. Many techniques have been proposed for integrating speech recognition and machine translation, but corresponding techniques have not yet been developed for speech synthesis. This work focuses on machine translation and speech synthesis, presenting a subjective evaluation designed to analyze their impact on speech-to-speech translation. The results of these analyses show that the naturalness and intelligibility of the synthesized speech are strongly affected by the fluency of the translated sentences. In addition, several features were found to correlate well with the average fluency of the translated sentences and the average naturalness of the synthesized speech.