In automatic speech recognition, the standard choice of language model is the well-known n-gram model, which predicts the probability of a word given its n-1 preceding words. However, the n-gram model cannot explicitly capture the grammatical relations within a sentence. In the present work, we therefore augment the n-gram model with grammatical features using the Whole Sentence Maximum Entropy framework. The grammatical features are head-modifier relations between pairs of words, together with the labels of those relations, obtained with a dependency grammar. We evaluate the model on a large-vocabulary speech recognition task using the Wall Street Journal speech corpus. The results show a substantial improvement in both test-set perplexity and word error rate.
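For readers unfamiliar with the framework, the two models contrasted in the abstract can be written out as follows. This is the standard formulation of whole-sentence maximum entropy (WSME) models, not notation taken from the paper itself, and the dependency-feature definition is an assumption inferred from the abstract. An n-gram model factorizes the sentence probability word by word, whereas the WSME model rescores the sentence as a whole:

\[
P_{\text{ngram}}(s) = \prod_{i=1}^{|s|} P\big(w_i \mid w_{i-n+1}, \ldots, w_{i-1}\big),
\qquad
P_{\text{WSME}}(s) = \frac{1}{Z}\, P_0(s)\, \exp\!\Big(\sum_{k} \lambda_k f_k(s)\Big),
\]

where \(P_0(s)\) is a baseline model (here, the n-gram model), each feature \(f_k(s)\) would indicate the occurrence of a particular labeled head-modifier dependency pair in the sentence, the weights \(\lambda_k\) are estimated by maximum entropy training, and \(Z\) normalizes the distribution over sentences.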