GLR parsing with multiple grammars for natural language queries

Authors:
Helen Meng;Po-Chui Luk;Kui Xu;Fuliang Weng
Affiliations:
The Chinese University of Hong Kong;The Chinese University of Hong Kong;The Chinese University of Hong Kong;Robert Bosch Corporation
Venue:
ACM Transactions on Asian Language Information Processing (TALIP)
Year:
2002



Abstract

This article presents an approach for parsing natural language queries that integrates multiple subparsers and subgrammars, in contrast to the traditional single grammar and parser approach. In using LR(k) parsers for natural language processing, we are faced with the problem of rapid growth in parsing table sizes as the number of grammar rules increases. We propose to partition the grammar into multiple subgrammars, each having its own parsing table and parser. Grammar partitioning helps reduce the overall parsing table size when compared to using a single grammar. We used the GLR parser with an LR(1) parsing table in our framework because GLR parsers can handle ambiguity in natural language. A parser composition technique then combines the parsers' outputs to produce an overall parse that is the same as the output parse of a single parser. Two different strategies were used for parser composition: (i) parser composition by cascading; and (ii) parser composition with predictive pruning.

Our experiments were conducted with natural language queries from the ATIS (Air Travel Information Service) domain. We have manually translated the ATIS-3 corpora into Chinese, and consequently we could experiment with grammar partitioning on parallel linguistic corpora. For English, the unpartitioned ATIS grammar has 72,869 states in its parsing table, while the partitioned English grammar has 3,350 states in total. For Chinese, grammar partitioning reduced the overall parsing table size from 29,734 states to 3,894 states. Both results show that grammar partitioning greatly economizes on the overall parsing table size. Language understanding performance was also examined. Parser composition imparts a robust parsing capability in our framework, and hence obtains a higher understanding performance when compared to using a single GLR parser.
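The state counts quoted in the abstract can be turned into relative savings with a little arithmetic. The snippet below is only an illustrative calculation on those four numbers; the names `STATE_COUNTS` and `table_reduction` are hypothetical and do not come from the article.

```python
# Parsing-table state counts quoted in the abstract:
# (single unpartitioned grammar, total over all partitioned subgrammars).
STATE_COUNTS = {
    "English": (72_869, 3_350),
    "Chinese": (29_734, 3_894),
}


def table_reduction(single: int, partitioned: int) -> tuple[float, float]:
    """Return (percent reduction in states, shrink factor)."""
    percent = 100.0 * (single - partitioned) / single
    factor = single / partitioned
    return percent, factor


for language, (single, partitioned) in STATE_COUNTS.items():
    percent, factor = table_reduction(single, partitioned)
    print(f"{language}: {percent:.1f}% fewer states ({factor:.1f}x smaller)")
```

On these figures the English table shrinks by roughly 95.4% (about 21.8 times smaller) and the Chinese table by roughly 86.9% (about 7.6 times smaller).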