Recently proposed methods for discriminative language modeling require alternate hypotheses in the form of lattices or N-best lists. These are usually generated by an Automatic Speech Recognition (ASR) system run on the same speech data used to train that system. This requirement restricts these methods to corpora where both the acoustic material and the corresponding true transcripts are available. Typically, however, the text data available for language model (LM) training is an order of magnitude larger than manually transcribed speech. This paper provides a general framework for exploiting this volume of textual data in the discriminative training of language models. We propose to generate probable N-best lists directly from the text material; these resemble the N-best lists produced by an ASR system because they incorporate phonetic confusability estimated from the acoustic model of the ASR system. We present experiments on Japanese spontaneous lecture speech which demonstrate that discriminative LM training with the proposed framework is effective and yields modest gains in ASR accuracy.
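The core idea can be illustrated with a minimal sketch: given plain text, substitute tokens with acoustically confusable alternatives and rank the resulting hypotheses to form a pseudo N-best list. The confusion table, its probabilities, and the function name below are hypothetical placeholders, not the paper's actual model (which estimates confusability from the ASR system's acoustic model).

```python
import heapq
import itertools

# Hypothetical word-level confusion table (illustrative only): each token
# maps to acoustically confusable alternatives with substitution probabilities.
CONFUSIONS = {
    "right": [("write", 0.10), ("rite", 0.05)],
    "two":   [("to", 0.15), ("too", 0.10)],
}

def pseudo_nbest(sentence, n=5):
    """Generate a probable N-best list directly from a text sentence.

    Each hypothesis substitutes zero or more tokens with confusable
    alternatives; its score is the product of per-position probabilities,
    where a kept token receives the probability mass not assigned to
    its confusions.
    """
    tokens = sentence.split()
    options = []
    for tok in tokens:
        alts = CONFUSIONS.get(tok, [])
        keep = 1.0 - sum(p for _, p in alts)
        options.append([(tok, keep)] + alts)
    # Enumerate all substitution combinations and keep the top-n by score.
    hyps = []
    for combo in itertools.product(*options):
        score = 1.0
        for _, p in combo:
            score *= p
        hyps.append((score, " ".join(w for w, _ in combo)))
    return heapq.nlargest(n, hyps)

# The 1-best is the original text; lower-ranked entries are acoustically
# plausible confusions, mimicking the competing hypotheses of ASR output.
nbest = pseudo_nbest("turn right at two", n=3)
```

In the actual framework, confusability would be modeled at the phone level using the acoustic model, and the resulting pseudo N-best lists would serve as the alternate hypotheses needed for discriminative LM training, in place of lists decoded from real speech.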