Web-based tools and methods for rapid pronunciation dictionary creation

Authors:
Tim Schlippe;Sebastian Ochs;Tanja Schultz
Venue:
Speech Communication
Year:
2014

Abstract

In this paper we study the potential as well as the challenges of using the World Wide Web as a seed for the rapid generation of pronunciation dictionaries in new languages. In particular, we describe Wiktionary, a community-driven resource of pronunciations in IPA notation, which is available in many different languages. First, we analyze Wiktionary in terms of language and vocabulary coverage and compare it in terms of quality and coverage with another source of pronunciation dictionaries in multiple languages (GlobalPhone). Second, we investigate the performance of statistical grapheme-to-phoneme models in ten different languages and measure the model performance for these languages over the amount of training data. The results show that for the studied languages about 15k phone tokens are sufficient to train stable grapheme-to-phoneme models. Third, we create grapheme-to-phoneme models for ten languages using both the GlobalPhone and the Wiktionary resources. The resulting pronunciation dictionaries are carefully evaluated along several quality checks, i.e. in terms of consistency, complexity, model confidence, grapheme n-gram coverage, and phoneme perplexity. Fourth, as a crucial prerequisite for a fully automated process of dictionary generation, we implement and evaluate methods to automatically remove flawed and inconsistent pronunciations from dictionaries. Last but not least, speech recognition experiments in six languages evaluate the usefulness of the dictionaries in terms of word error rates. Our results indicate that the web resources of Wiktionary can be successfully leveraged to fully automatically create pronunciation dictionaries in new languages.
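
The abstract mentions automatically removing flawed and inconsistent pronunciations but does not spell out the filtering criteria. As one plausible illustration only, and not necessarily the authors' method, the sketch below drops entries whose phoneme-to-grapheme length ratio lies far from the dictionary mean. The function name `filter_by_length_ratio`, the threshold `max_z`, and the toy entries are all hypothetical.

```python
from statistics import mean, pstdev


def filter_by_length_ratio(entries, max_z=2.0):
    """Keep entries whose phoneme/grapheme length ratio is close to the mean.

    entries: list of (word, phoneme_list) pairs.
    max_z:   assumed threshold, in standard deviations, beyond which an
             entry is treated as suspicious and dropped.
    """
    # Entries with an empty word or an empty pronunciation are dropped outright.
    usable = [(word, phones) for word, phones in entries if word and phones]
    if not usable:
        return []

    ratios = [len(phones) / len(word) for word, phones in usable]
    mu = mean(ratios)
    sigma = pstdev(ratios)

    # If every ratio is identical there is nothing to separate.
    if sigma == 0:
        return usable

    return [
        entry
        for entry, ratio in zip(usable, ratios)
        if abs(ratio - mu) <= max_z * sigma
    ]


# Toy dictionary with made-up phone symbols; the last entry is deliberately
# implausible (eight phones for a two-letter word).
toy_dictionary = [
    ("haus", ["h", "a", "u", "s"]),
    ("katze", ["k", "a", "t", "s", "e"]),
    ("baum", ["b", "a", "u", "m"]),
    ("hund", ["h", "u", "n", "t"]),
    ("tisch", ["t", "i", "S"]),
    ("ab", ["a", "b", "a", "b", "a", "b", "a", "b"]),
]

kept = filter_by_length_ratio(toy_dictionary)
for word, phones in kept:
    print(word, " ".join(phones))
```

A length-ratio check is cheap and language-independent, but it only catches gross mismatches; a real pipeline would likely combine it with alignment-based or model-based checks, and the threshold would need tuning per language.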