Multilingual pronunciation by analogy

Authors:
Tasanawan Soonklang;Robert I. Damper;Yannick Marchand
Affiliations:
Information: Signals, Images, Systems (ISIS) Research Group, School of Electronics and Computer Science, University of Southampton, Southampton SO17 1BJ, UK e-mail: anncenter@gmail.com, rid@ecs.so ...;Information: Signals, Images, Systems (ISIS) Research Group, School of Electronics and Computer Science, University of Southampton, Southampton SO17 1BJ, UK e-mail: anncenter@gmail.com, rid@ecs.so ...;Institute for Biodiagnostics (Atlantic), National Research Council Canada, Neuroimaging Research Laboratory, 1796 Summer Street, Suite 3900, Halifax, Nova Scotia, Canada B3H 3A7 e-mail: yannick.mar ...
Venue:
Natural Language Engineering
Year:
2008

Abstract

Automatic pronunciation of unknown words (i.e., those not in the system dictionary) is a difficult problem in text-to-speech (TTS) synthesis. Many data-driven approaches have been applied to the problem, as a backup strategy for cases where dictionary matching fails. The difficulty of the problem depends on the complexity of the spelling-to-sound mappings in the writing system of the language concerned. Hence, the degree of success achieved varies widely across languages, and also across dictionaries, even for the same language with the same method. Further, the sizes of the training and test sets are an important consideration in data-driven approaches. In this paper, we study the variation of letter-to-phoneme transcription accuracy across seven European languages with twelve different lexicons. We also study the relationship between dictionary size and the accuracy obtained. The largest dictionary for each language has been partitioned into ten approximately equal-sized subsets, which are combined to give ten different-sized test sets. In view of its superior performance in previous work, the transcription method used is pronunciation by analogy (PbA). The best results are obtained for Spanish, generally believed to have a very regular (‘shallow’) orthography, and the poorest for English, a language whose irregular spelling system is legendary. For those languages for which multiple dictionaries were available (i.e., French and English), results were found to vary across dictionaries. As to the relationship between dictionary size and transcription accuracy, we find that performance improves monotonically as dictionary size grows. However, the performance gain decelerates (tends to saturate) as the dictionary increases in size; the relation can be described simply by a logarithmic regression, one parameter of which (α) can be taken as quantifying the depth of orthography of a language.
We find that α for a language is significantly correlated with transcription performance on a small dictionary (approximately 10,000 words) for that language, but less so for asymptotic performance. This may be because our measure of asymptotic performance is unreliable, being extrapolated from the fitted logarithmic regression.
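The dictionary-size analysis described in the abstract can be illustrated with a minimal sketch. The abstract does not give the regression equation, so the form accuracy = slope · ln(size) + intercept is an assumption here, as is the question of which fitted coefficient corresponds to α. The function names `nested_subsets` and `fit_log_regression` are hypothetical, and the accuracy figures are synthetic values generated from a known curve, not results from the paper.

```python
import math
import random


def nested_subsets(entries, parts=10, seed=0):
    """Shuffle the entries, cut them into `parts` near-equal chunks, and
    return the cumulative unions of 1, 2, ..., `parts` chunks."""
    shuffled = list(entries)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    # Integer boundaries keep every chunk at floor(n/parts) or ceil(n/parts).
    bounds = [(k * n) // parts for k in range(parts + 1)]
    return [shuffled[:bounds[k]] for k in range(1, parts + 1)]


def fit_log_regression(sizes, accuracies):
    """Ordinary least-squares fit of accuracy = slope * ln(size) + intercept
    (assumed functional form; the abstract only says 'logarithmic')."""
    xs = [math.log(s) for s in sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(accuracies) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, accuracies))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic illustration only: a stand-in lexicon of 20,000 placeholder entries.
lexicon = [f"word{i}" for i in range(20000)]
subsets = nested_subsets(lexicon, parts=10)
sizes = [len(s) for s in subsets]

# Made-up accuracies drawn from a known logarithmic curve, so the fit
# should recover the slope 5.0 and intercept 40.0 used to generate them.
accuracies = [5.0 * math.log(size) + 40.0 for size in sizes]
slope, intercept = fit_log_regression(sizes, accuracies)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

In a real experiment the synthetic `accuracies` list would be replaced by the transcription accuracy measured on each subset. Because a logarithmic curve has no finite limit, any 'asymptotic' figure read off such a fit depends on where the extrapolation is stopped, which is consistent with the abstract's caution about that measure.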