Improving proper name recognition by means of automatically learned pronunciation variants

Authors:
Bert Réveil;Jean-Pierre Martens;Henk van den Heuvel
Affiliations:
DSSP Group, ELIS, UGent, Sint-Pietersnieuwstraat 41, 9000 Ghent, Belgium;DSSP Group, ELIS, UGent, Sint-Pietersnieuwstraat 41, 9000 Ghent, Belgium;CLST, Faculty of Arts, Radboud Universiteit Nijmegen, The Netherlands
Venue:
Speech Communication
Year:
2012

Abstract

This paper introduces a novel lexical modeling approach that aims to improve large vocabulary proper name recognition for native and non-native speakers. The method uses one or more so-called phoneme-to-phoneme (P2P) converters to add useful pronunciation variants to a baseline lexicon. Each P2P converter is a stochastic automaton that applies context-dependent transformation rules to a baseline transcription generated by a standard grapheme-to-phoneme (G2P) converter. The paper focuses on two things: the inclusion of different types of features to describe the rule context, ranging from the identities of neighboring phonemes to morphological and even semantic features such as the language of origin of the name, and the development and assessment of methods that can cope with cross-lingual issues. Another aim is to ensure that the proposed solutions are applicable to new names (not seen during system development) and useful in the hands of product developers who know their application domain well but have little expertise in automatic speech recognition (ASR) and speech corpus acquisition. The proposed method was evaluated on person name and geographical name recognition, two economically interesting domains in which non-native speakers as well as non-native names occur very frequently. For the recognition experiments, a state-of-the-art commercial ASR engine was employed. The experimental results demonstrate that significant improvements in recognition accuracy can be achieved: large gains (up to 40% relative) when prior knowledge of the speaker's mother tongue and the name origin is available, and still significant gains when no such prior information is available.
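
To make the idea of a P2P converter more concrete, here is a minimal Python sketch of how context-dependent, probabilistic rewrite rules could expand one baseline transcription into several weighted pronunciation variants. It is an illustration only and not the authors' implementation: the `Rule` class, the `generate_variants` function, the phoneme symbols, the rule probabilities, and the origin labels are all hypothetical, and in the paper the rules are learned automatically rather than written by hand.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Rule:
    """Hypothetical context-dependent rewrite rule (not the paper's format)."""
    left: str                 # required left neighbor ("*" = any, "#" = word boundary)
    focus: str                # baseline phoneme the rule rewrites
    right: str                # required right neighbor ("*" = any, "#" = word boundary)
    output: Tuple[str, ...]   # replacement phonemes (empty tuple = deletion)
    prob: float               # probability of firing when the context matches
    origin: str = "*"         # semantic feature: language of origin of the name


def rule_applies(rule: Rule, phones: Sequence[str], i: int, name_origin: str) -> bool:
    """Check the rule context against the baseline transcription at position i."""
    left = phones[i - 1] if i > 0 else "#"
    right = phones[i + 1] if i + 1 < len(phones) else "#"
    return (rule.focus == phones[i]
            and rule.left in ("*", left)
            and rule.right in ("*", right)
            and rule.origin in ("*", name_origin))


def generate_variants(baseline: Sequence[str], rules: List[Rule], name_origin: str,
                      max_variants: int = 4) -> List[Tuple[Tuple[str, ...], float]]:
    """Expand a baseline (G2P) transcription into weighted pronunciation variants.

    Assumes the probabilities of competing rules at one position sum to at most 1;
    the remaining mass goes to keeping the baseline phoneme unchanged.
    """
    hyps: Dict[Tuple[str, ...], float] = {(): 1.0}
    for i, phone in enumerate(baseline):
        active = [r for r in rules if rule_applies(r, baseline, i, name_origin)]
        keep_prob = 1.0 - sum(r.prob for r in active)
        options = [((phone,), keep_prob)] + [(r.output, r.prob) for r in active]
        new_hyps: Dict[Tuple[str, ...], float] = {}
        for prefix, p in hyps.items():
            for out, q in options:
                if q > 0.0:
                    key = prefix + out
                    # Merge paths that yield the same phoneme sequence.
                    new_hyps[key] = new_hyps.get(key, 0.0) + p * q
        hyps = new_hyps
    ranked = sorted(hyps.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:max_variants]


# Invented example: a baseline transcription and two hand-written toy rules.
baseline = ["j", "a", "k", "s", "o", "n"]
rules = [
    Rule(left="#", focus="j", right="*", output=("dZ",), prob=0.6, origin="EN"),
    Rule(left="*", focus="a", right="k", output=("E",), prob=0.3),
]
variants = generate_variants(baseline, rules, name_origin="EN")
for phones, prob in variants:
    print(" ".join(phones), round(prob, 2))
```

In this sketch the rule context is always matched against the baseline transcription, so rules at different positions do not interact. The highest-scoring variants would then be added to the lexicon next to the baseline entry; how many variants to keep is a design choice, represented here by the assumed `max_variants` parameter.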