A method of incorporating bigram constraints into an LR table and its effectiveness in natural language processing

Authors:
Hiroki Imai; Hozumi Tanaka
Affiliations:
Tokyo Institute of Technology, Meguro, Tokyo, Japan; Tokyo Institute of Technology, Meguro, Tokyo, Japan
Venue:
NeMLaP3/CoNLL '98 Proceedings of the Joint Conferences on New Methods in Language Processing and Computational Natural Language Learning
Year:
1998

Abstract

In this paper, we propose a method for constructing a bigram LR table by incorporating bigram constraints into an LR table. With a bigram LR table, a GLR parser can use both bigram and CFG constraints in natural language processing. Applying bigram LR tables to our GLR method has the following advantages: (1) Language models that use bigram LR tables have lower perplexity than simple bigram language models, because local constraints (bigram) and global constraints (CFG) are combined in a single bigram LR table. (2) Bigram constraints are easily acquired from a given corpus, so data sparseness is unlikely to arise. (3) Separating local and global constraints keeps the number of CFG rules down. The first advantage reduces complexity and, as a result, improves GLR parsing performance. Our experiments demonstrate the effectiveness of our method.
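As a rough illustration of advantage (2), the sketch below shows one way bigram constraints might be collected from a corpus and used to discard lookahead symbols that cannot follow the preceding symbol. It is not the table-construction algorithm of the paper. The function names, the `<s>` and `</s>` boundary markers, and the toy part-of-speech corpus are all assumptions made for illustration.

```python
from typing import Iterable, List, Sequence, Set, Tuple

Bigram = Tuple[str, str]

# Hypothetical boundary markers; the paper does not specify these.
START, END = "<s>", "</s>"


def collect_bigrams(corpus: Iterable[Sequence[str]]) -> Set[Bigram]:
    """Return the set of adjacent symbol pairs observed in the corpus."""
    allowed: Set[Bigram] = set()
    for sentence in corpus:
        padded = [START, *sentence, END]
        # Pair every symbol with its immediate successor.
        allowed.update(zip(padded, padded[1:]))
    return allowed


def filter_lookaheads(
    allowed: Set[Bigram], prev: str, lookaheads: Sequence[str]
) -> List[str]:
    """Keep only the lookahead symbols that were seen right after `prev`."""
    return [a for a in lookaheads if (prev, a) in allowed]


# Toy corpus of part-of-speech sequences (illustrative data only).
toy_corpus = [
    ["det", "noun", "verb"],
    ["noun", "verb", "det", "noun"],
]
allowed_pairs = collect_bigrams(toy_corpus)

# After "det", only "noun" was ever observed, so the other candidates are dropped.
candidates = filter_lookaheads(allowed_pairs, "det", ["noun", "verb", "det"])
```

In a bigram LR table, a connection check of this kind would presumably be applied when the table is built rather than at parse time, so that actions violating the bigram constraints never appear in the table. The sketch only shows the constraint itself.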