Morpheme-based and factored language modeling for Amharic speech recognition

Authors:
Martha Yifiru Tachbelie;Solomon Teferra Abate;Wolfgang Menzel
Affiliations:
Department of Informatics, University of Hamburg, Hamburg, Germany;Joseph Fourier University, LIG/GETALP, Grenoble Cedex;Department of Informatics, University of Hamburg, Hamburg, Germany
Venue:
LTC'09 Proceedings of the 4th conference on Human language technology: challenges for computer science and linguistics
Year:
2009

Abstract

This paper presents the application of morpheme-based and factored language models to an Amharic speech recognition task. Using morphemes in both the acoustic and language models often degrades performance because of higher acoustic confusability, and factored language models are difficult to use in standard word decoders, so we applied the models in a lattice rescoring framework. Lattices of the 100 best alternatives for each test sentence of the 5k development test set were generated using a baseline speech recognizer with a word-based backoff bigram language model. The lattices were then rescored with various morpheme-based and factored language models. Morpheme-based language models yielded a slight improvement in word recognition accuracy, while factored language models led to notable improvements.
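
The rescoring idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it rescores a flat N-best list rather than a lattice, uses an add-one smoothed morpheme bigram model instead of a backoff or factored model, and all names (`train_bigram`, `bigram_logprob`, `rescore_nbest`, `segment`, `lm_weight`) as well as the placeholder tokens are assumptions made for illustration.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Collect history and bigram counts over morpheme sequences."""
    histories, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        histories.update(padded[:-1])                 # counts of each history token
        bigrams.update(zip(padded[:-1], padded[1:]))  # counts of (history, next) pairs
    vocab = {t for s in sentences for t in s} | {"</s>"}
    return histories, bigrams, len(vocab)


def bigram_logprob(tokens, model):
    """Add-one smoothed bigram log-probability (assumption: stands in for backoff)."""
    histories, bigrams, vocab_size = model
    padded = ["<s>"] + tokens + ["</s>"]
    return sum(
        math.log((bigrams[(h, w)] + 1) / (histories[h] + vocab_size))
        for h, w in zip(padded[:-1], padded[1:])
    )


def rescore_nbest(nbest, model, segment, lm_weight=1.0):
    """Pick the word hypothesis with the best combined score.

    nbest: list of (words, acoustic_logprob) pairs from a baseline recognizer.
    segment: hypothetical function mapping a word to its list of morphemes.
    """
    def total(hyp):
        words, acoustic = hyp
        morphs = [m for w in words for m in segment(w)]
        return acoustic + lm_weight * bigram_logprob(morphs, model)

    return max(nbest, key=total)[0]


# Placeholder data: "+" marks a morpheme boundary inside a word.
model = train_bigram([["x", "y"], ["x", "y"]])
nbest = [(["x+z"], -1.0), (["x+y"], -1.2)]
best = rescore_nbest(nbest, model, segment=lambda w: w.split("+"))
```

Because hypotheses stay word-level and are only segmented for language model scoring, the output remains a word sequence, which is how a morpheme-based model can be evaluated by word recognition accuracy in this kind of setup.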