Exploiting headword dependency and predictive clustering for language modeling

Authors:
Jianfeng Gao; Hisami Suzuki; Yang Wen
Affiliations:
Microsoft Research, Beijing, China; Microsoft Research, Redmond, WA; Information Science of Tsinghua University, China
Venue:
EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
Year:
2002

Abstract

This paper presents several practical ways of incorporating linguistic structure into language models. A headword detector is first applied to detect the headword of each phrase in a sentence. A permuted headword trigram model (PHTM) is then generated from the annotated corpus. Finally, PHTM is extended to a cluster PHTM (C-PHTM) by defining clusters for similar words in the corpus. We evaluated the proposed models on the realistic application of Japanese Kana-Kanji conversion. Experiments show that C-PHTM achieves a 15% error rate reduction over the word trigram model. This demonstrates that simple methods such as the headword trigram and predictive clustering can effectively capture long-distance word dependency and substantially outperform a word trigram model.
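
As a rough illustration of the predictive clustering idea mentioned in the abstract, the sketch below factors a word probability into a cluster prediction followed by a word prediction within that cluster: P(w | h) = P(c(w) | h) × P(w | h, c(w)). It is not the paper's implementation. The toy corpus, the hand-made `cluster_of` map, and all function names are assumptions made for illustration; the history `h` here is simply the two preceding words rather than headwords, and the estimates are unsmoothed relative frequencies.

```python
from collections import Counter

# Toy corpus and a hand-made word-to-cluster map (both hypothetical).
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
    "the cat lay on the rug".split(),
]
cluster_of = {
    "the": "DET", "on": "PREP",
    "cat": "ANIMAL", "dog": "ANIMAL",
    "sat": "VERB", "lay": "VERB",
    "mat": "THING", "rug": "THING",
}

hist_counts = Counter()          # count(h)
hist_cluster_counts = Counter()  # count(h, c)
hist_word_counts = Counter()     # count(h, w)

for sentence in corpus:
    padded = ["<s>", "<s>"] + sentence
    for i in range(2, len(padded)):
        h = (padded[i - 2], padded[i - 1])  # two-word history
        w = padded[i]
        hist_counts[h] += 1
        hist_cluster_counts[h, cluster_of[w]] += 1
        hist_word_counts[h, w] += 1


def p_cluster(c, h):
    """Relative-frequency estimate of P(c | h); 0.0 for an unseen history."""
    total = hist_counts[h]
    return hist_cluster_counts[h, c] / total if total else 0.0


def p_word_given_cluster(w, h):
    """Relative-frequency estimate of P(w | h, c(w)); 0.0 if (h, c) is unseen."""
    total = hist_cluster_counts[h, cluster_of[w]]
    return hist_word_counts[h, w] / total if total else 0.0


def p_predictive(w, h):
    """P(w | h) as cluster prediction times word-within-cluster prediction."""
    return p_cluster(cluster_of[w], h) * p_word_given_cluster(w, h)
```

With these counts, `p_predictive("rug", ("on", "the"))` evaluates to 2/3, the same value a plain relative-frequency trigram would give. When each word belongs to exactly one cluster, the factorization is exact, so any practical gain would have to come from smoothing the two factors separately, which this sketch omits.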