Using Dependency Grammar Features in Whole Sentence Maximum Entropy Language Model for Speech Recognition

  • Authors:
  • Teemu Ruokolainen; Tanel Alumäe; Marcus Dobrinkat

  • Affiliations:
  • Department of Information and Computer Science, School of Science and Technology, Aalto University; Laboratory of Phonetics and Speech Technology, Institute of Cybernetics at Tallinn University of Technology; Department of Information and Computer Science, School of Science and Technology, Aalto University

  • Venue:
  • Human Language Technologies -- The Baltic Perspective: Proceedings of the Fourth International Conference Baltic HLT 2010
  • Year:
  • 2010

Abstract

In automatic speech recognition, the standard choice of language model is the well-known n-gram model, which predicts the probability of a word given its n-1 preceding words. However, the n-gram model cannot explicitly capture the grammatical relations within a sentence. In the present work, we apply the Whole Sentence Maximum Entropy (WSME) framework to augment the n-gram model with grammatical features. The features are head-modifier relations between pairs of words, together with the labels of those relations, obtained from a dependency grammar analysis of each sentence. We evaluate the model in a large-vocabulary speech recognition task on the Wall Street Journal speech corpus. The results show a substantial improvement in both test-set perplexity and word error rate.
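
The WSME framework combines a baseline n-gram sentence probability p0(s) with an exponential factor over sentence-level features, p(s) = (1/Z) p0(s) exp(Σi λi fi(s)). The sketch below illustrates how such a model could rescore an n-best list of recognition hypotheses; it is a hypothetical illustration under these assumptions, not the authors' implementation, and the feature weights, the stand-in dependency parser, and all function names are invented for the example.

```python
# A minimal sketch of Whole Sentence Maximum Entropy (WSME) rescoring,
# assuming p(s) = (1/Z) * p0(s) * exp(sum_i lambda_i * f_i(s)),
# where p0 is the baseline n-gram model and the f_i are sentence-level
# dependency features. All weights and parses below are toy values.

# Hypothetical weights lambda_i for labeled head-modifier features
# (head_word, modifier_word, label); a real system trains these on text.
LAMBDA = {
    ("bought", "he", "subj"): 0.3,
    ("bought", "shares", "obj"): 0.9,
    ("shares", "the", "det"): 0.2,
}

def dependency_features(sentence):
    """Hypothetical stand-in for a dependency parser: a real system
    would parse the sentence; here we look up canned toy analyses."""
    toy_parses = {
        ("he", "bought", "the", "shares"): [
            ("bought", "he", "subj"),
            ("bought", "shares", "obj"),
            ("shares", "the", "det"),
        ],
    }
    return toy_parses.get(tuple(sentence), [])

def wsme_score(sentence, ngram_logprob):
    """Unnormalized WSME log-score:
    log p(s) = log p0(s) + sum_i lambda_i * f_i(s) - log Z.
    The constant log Z is the same for every hypothesis, so it is
    omitted when ranking."""
    score = ngram_logprob  # log p0(s) from the baseline n-gram model
    for feature in dependency_features(sentence):
        score += LAMBDA.get(feature, 0.0)
    return score

def rescore(nbest):
    """Pick the best hypothesis from an n-best list of
    (tokens, n-gram log-probability) pairs."""
    return max(nbest, key=lambda hyp: wsme_score(hyp[0], hyp[1]))

nbest = [
    (["he", "bought", "the", "shares"], -21.3),
    (["he", "bought", "the", "chairs"], -20.9),
]
print(rescore(nbest)[0])  # dependency features overturn the n-gram ranking
```

Because the normalizer Z cancels across hypotheses, whole-sentence models of this form can be used for n-best rescoring without ever computing Z explicitly.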