Cross-language use of acoustic information for automatic speech recognition

  • Authors:
  • C. Nieuwoudt; E. C. Botha

  • Affiliations:
  • Department of Electrical, Electronic & Computer Engineering, University of Pretoria, Pretoria 0002, South Africa;Department of Electrical, Electronic & Computer Engineering, University of Pretoria, Pretoria 0002, South Africa

  • Venue:
  • Speech Communication
  • Year:
  • 2002

Abstract

Techniques are investigated that use acoustic information from existing source language databases to implement automatic speech recognition (ASR) systems for new target languages. The assumption is that too little target language data is available to train a robust ASR system. The strategies evaluated for cross-language use of acoustic information include (i) training on pooled source and target language data, (ii) adapting source language models using target language data, (iii) adapting models trained on pooled source and target language data using target language data only, and (iv) transforming source language data to augment target language data for model training. These strategies are combined with Bayesian and transformation-based techniques to present a framework for cross-language reuse of acoustic information. Experiments are performed for a large number of approaches from the framework, using relatively large amounts of English speech data, taken either from a separate database or from the same database as the smaller amounts of Afrikaans speech data, to improve the performance of an Afrikaans speech recogniser. Results indicate that a significant reduction in word error rate is achievable (between 14% and 48% in the experiments), depending on the amount of target language data available.
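
As an illustration of strategy (ii), the sketch below shows one common form of Bayesian adaptation: a maximum a posteriori (MAP) update of a single Gaussian mean, with the source language mean acting as the prior. The abstract does not state which Bayesian estimator the authors use, so this is an assumption chosen for illustration only. The function name `map_adapt_mean` and the prior weight `tau` are hypothetical and do not come from the paper.

```python
def map_adapt_mean(source_mean, target_frames, tau=10.0):
    """MAP update of one Gaussian mean (illustrative sketch, not the paper's method).

    source_mean   : mean vector estimated on source language data (the prior)
    target_frames : list of target language feature vectors assigned to this Gaussian
    tau           : assumed prior weight (> 0); larger values trust the source mean more
    """
    n = len(target_frames)
    dims = len(source_mean)
    # Per-dimension sum of the target language frames.
    sums = [sum(frame[d] for frame in target_frames) for d in range(dims)]
    # Weighted combination of the prior mean and the target data:
    # mu_map = (tau * mu_source + sum(x)) / (tau + n)
    return [(tau * source_mean[d] + sums[d]) / (tau + n) for d in range(dims)]


# With no target data the source mean is returned unchanged; as the number of
# target frames grows, the estimate moves towards the target sample mean.
adapted = map_adapt_mean([0.0, 0.0], [[2.0, 2.0], [2.0, 2.0]], tau=2.0)
```

This interpolation behaviour is in line with the abstract's remark that the gain depends on the amount of target language data available: with few target frames the source model dominates, and with many frames the target data takes over.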