Adapting language models across styles and topics, such as for lecture transcription, involves combining generic style models with topic-specific content relevant to the target document. In this work, we investigate the use of the Hidden Markov Model with Latent Dirichlet Allocation (HMM-LDA) to obtain syntactic state and semantic topic assignments for word instances in the training corpus. From these context-dependent labels, we construct style and topic models that better fit the target document, and we extend traditional bag-of-words topic models to n-grams. Experiments with static model interpolation yielded a 7.1% reduction in perplexity and a 2.1% relative reduction in word error rate (WER) over an adapted trigram baseline. Adaptive interpolation of mixture components further reduced perplexity by 9.5% and WER by a modest 0.3%.
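To make the idea of interpolating mixture components concrete, the following is a minimal sketch of linear interpolation between component language models, with the mixture weights re-estimated by expectation-maximization (EM) on a text sample. It is an illustration under stated assumptions, not the procedure used in this work: the abstract does not specify how the interpolation weights are set, and the function names (`interpolate`, `perplexity`, `em_weights`) and the per-word probabilities in `stream` are hypothetical.

```python
import math

def interpolate(probs, weights):
    """Linearly mix the probabilities that each component model assigns to one word."""
    return sum(w * p for w, p in zip(weights, probs))

def perplexity(stream, weights):
    """Perplexity of the interpolated model over a sequence of words."""
    log_sum = sum(math.log(interpolate(probs, weights)) for probs in stream)
    return math.exp(-log_sum / len(stream))

def em_weights(stream, weights, iterations=20):
    """Re-estimate mixture weights with EM (a generic technique, assumed here)."""
    for _ in range(iterations):
        totals = [0.0] * len(weights)
        for probs in stream:
            mix = interpolate(probs, weights)
            # E-step: posterior responsibility of each component for this word.
            for k, (w, p) in enumerate(zip(weights, probs)):
                totals[k] += w * p / mix
        # M-step: new weight is the average responsibility.
        weights = [t / len(stream) for t in totals]
    return weights

# Hypothetical data: for each word, the probability assigned by a
# "style" component and by a "topic" component given its history.
stream = [(0.20, 0.01), (0.05, 0.30), (0.10, 0.02), (0.01, 0.25)]

static = [0.5, 0.5]                 # fixed (static) interpolation weights
tuned = em_weights(stream, static)  # weights adapted to this sample

print("static perplexity:", perplexity(stream, static))
print("tuned perplexity: ", perplexity(stream, tuned))
```

Because each EM iteration cannot decrease the likelihood of the sample, the tuned weights give a perplexity no higher than the static starting point on that same sample. Whether the gain carries over to unseen text depends on how representative the sample is.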