Cross-language use of acoustic information for automatic speech recognition

Authors:
C. Nieuwoudt; E. C. Botha
Affiliations:
Department of Electrical, Electronic & Computer Engineering, University of Pretoria, Pretoria 0002, South Africa;Department of Electrical, Electronic & Computer Engineering, University of Pretoria, Pretoria 0002, South Africa
Venue:
Speech Communication
Year:
2002

Abstract

Techniques are investigated that use acoustic information from existing source-language databases to build automatic speech recognition (ASR) systems for new target languages. The assumption is that too little target-language data is available to train a robust ASR system. Four strategies for cross-language use of acoustic information are evaluated: (i) training on pooled source- and target-language data, (ii) adapting source-language models using target-language data, (iii) adapting models trained on pooled source- and target-language data using target-language data only, and (iv) transforming source-language data to augment target-language data for model training. These strategies are combined with Bayesian and transformation-based techniques to form a framework for cross-language reuse of acoustic information. Experiments are performed for a large number of approaches from the framework, using relatively large amounts of English speech data, taken either from a separate database or from the same database as the smaller amounts of Afrikaans speech data, to improve the performance of an Afrikaans speech recogniser. Results indicate that a significant reduction in word error rate is achievable (between 14% and 48% in the experiments), depending on the amount of target-language data available.
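
As an illustration of what Bayesian adaptation of a source-language model with a small amount of target-language data can look like, the following is a minimal sketch of the textbook maximum a posteriori (MAP) update for a single Gaussian mean. It is not taken from the paper: the function name `map_adapt_mean`, the prior weight `tau`, and the toy feature vectors are assumptions made for illustration only, and a real system would apply the update per HMM state and mixture component using occupation-weighted statistics.

```python
from typing import List, Sequence


def map_adapt_mean(source_mean: Sequence[float],
                   target_frames: Sequence[Sequence[float]],
                   tau: float = 10.0) -> List[float]:
    """Textbook MAP update of one Gaussian mean (illustrative sketch only).

    source_mean   -- mean estimated on source-language data, used as the prior
    target_frames -- feature vectors from the target language assigned to
                     this Gaussian (hypothetical hard assignment)
    tau           -- assumed prior weight: larger values keep the result
                     closer to the source-language mean
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    dim = len(source_mean)
    sums = [0.0] * dim
    for frame in target_frames:
        if len(frame) != dim:
            raise ValueError("frame dimension does not match the mean")
        for d, value in enumerate(frame):
            sums[d] += value
    n = len(target_frames)
    # Interpolate between the prior mean and the target-data statistics:
    # mu_map = (tau * mu_prior + sum of target frames) / (tau + n)
    return [(tau * source_mean[d] + sums[d]) / (tau + n) for d in range(dim)]
```

With no target-language frames the function returns the source-language mean unchanged, and as the number of frames grows the estimate moves towards the target-language sample mean. This matches the abstract's observation that the benefit depends on the amount of target-language data available.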