Morpheme-based and factored language modeling for Amharic speech recognition

  • Authors:
  • Martha Yifiru Tachbelie; Solomon Teferra Abate; Wolfgang Menzel

  • Affiliations:
  • Department of Informatics, University of Hamburg, Hamburg, Germany; Joseph Fourier University, LIG/GETALP, Grenoble Cedex; Department of Informatics, University of Hamburg, Hamburg, Germany

  • Venue:
  • LTC'09 Proceedings of the 4th conference on Human language technology: challenges for computer science and linguistics
  • Year:
  • 2009

Abstract

This paper presents the application of morpheme-based and factored language models to an Amharic speech recognition task. Using morphemes in both the acoustic and language models often degrades performance because of higher acoustic confusability, and factored language models are difficult to use in standard word decoders, so we applied the models in a lattice rescoring framework. Lattices of the 100 best alternatives for each test sentence of the 5k development test set were generated using a baseline speech recognizer with a word-based backoff bigram language model. The lattices were then rescored with various morpheme-based and factored language models. Morpheme-based language models yielded a slight improvement in word recognition accuracy, while factored language models led to notable improvements.
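
As a rough illustration of the rescoring idea described in the abstract, the sketch below re-ranks a list of alternative hypotheses for one sentence with a second language model. It is a simplified n-best view of lattice rescoring, not the authors' implementation: the names `Hypothesis`, `rescore_nbest`, `toy_lm_logprob`, and the `lm_weight` parameter are hypothetical, the log-domain score combination is an assumption, and the toy scoring function merely stands in for a morpheme-based or factored language model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Hypothesis:
    """One alternative transcription produced by a baseline recognizer."""
    words: List[str]
    acoustic_score: float  # assumed to be a log-domain score


def rescore_nbest(
    hyps: List[Hypothesis],
    lm_logprob: Callable[[List[str]], float],
    lm_weight: float = 1.0,
) -> Hypothesis:
    """Return the hypothesis with the highest combined score.

    The combined score is acoustic_score + lm_weight * lm_logprob(words),
    an assumed combination chosen for illustration only.
    """
    if not hyps:
        raise ValueError("the n-best list is empty")
    return max(
        hyps,
        key=lambda h: h.acoustic_score + lm_weight * lm_logprob(h.words),
    )


def toy_lm_logprob(words: List[str]) -> float:
    """Hypothetical stand-in for the rescoring language model.

    It penalizes every token except the placeholder token "b".
    """
    return sum(0.0 if w == "b" else -2.0 for w in words)


# Two placeholder hypotheses; the tokens and scores are made up.
hyps = [
    Hypothesis(words=["a"], acoustic_score=-10.0),
    Hypothesis(words=["b"], acoustic_score=-11.0),
]
best = rescore_nbest(hyps, toy_lm_logprob)
```

With the language model weight set to zero, the ranking falls back to the baseline acoustic scores, so the weight controls how strongly the second model can overturn the first-pass result.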