This paper presents several practical ways of incorporating linguistic structure into language models. A headword detector is first applied to detect the headword of each phrase in a sentence. A permuted headword trigram model (PHTM) is then generated from the annotated corpus. Finally, PHTM is extended to a cluster PHTM (C-PHTM) by defining clusters for similar words in the corpus. We evaluated the proposed models on a realistic application, Japanese Kana-Kanji conversion. Experiments show that C-PHTM achieves a 15% error rate reduction over the word trigram model. This demonstrates that simple methods such as the headword trigram and predictive clustering can effectively capture long-distance word dependencies and substantially outperform a word trigram model.
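
The headword trigram idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the model's details, so everything here is an assumption made for illustration. In particular, the function names `headword_trigram_events` and `permuted_counts`, the boolean headword flag standing in for the headword detector's output, the `<s>` padding symbol, and the reading of "permuted" as counting each event under both orders of the two preceding headwords are all hypothetical. Clustering (the step from PHTM to C-PHTM) and smoothing are omitted.

```python
from collections import Counter

def headword_trigram_events(tagged):
    """Collect (h2, h1, word) events for one sentence.

    `tagged` is a list of (word, is_headword) pairs; the flag stands in
    for the output of a headword detector (assumed, not from the source).
    The history of each word is the two most recent headwords to its
    left, not the two adjacent words.
    """
    heads = ["<s>", "<s>"]  # hypothetical sentence-start padding
    events = []
    for word, is_head in tagged:
        events.append((heads[-2], heads[-1], word))
        if is_head:
            heads.append(word)
    return events

def permuted_counts(events):
    """Count each event under both orders of its two history headwords.

    This is one possible reading of "permuted", assumed for illustration.
    """
    counts = Counter()
    for h2, h1, word in events:
        counts[(h2, h1, word)] += 1
        if h1 != h2:  # avoid double counting identical histories
            counts[(h1, h2, word)] += 1
    return counts

# Toy English sentence with hand-assigned headword flags.
tagged = [("the", False), ("cat", True), ("on", False),
          ("mat", True), ("sleeps", True)]
events = headword_trigram_events(tagged)
counts = permuted_counts(events)
```

In this toy sentence, "sleeps" is predicted from the headwords "cat" and "mat" and not from its immediate neighbours, which is how a trigram-sized history can reach across intervening function words.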