Generalized probabilistic LR parsing of natural language (Corpora) with unification-based grammars

Authors:
Ted Briscoe;John Carroll
Affiliations:
University of Cambridge;University of Cambridge
Venue:
Computational Linguistics - Special issue on using large corpora: I
Year:
1993

References: 38
Cited by: 71

Compilers: principles, techniques, and tools

Compilers: principles, techniques, and tools
An efficient augmented-context-free parsing algorithm

Computational Linguistics
Prolog and natural-language analysis

Prolog and natural-language analysis
LR PARSING theory and practice

LR PARSING theory and practice
Information-based syntax and semantics: Vol. 1: fundamentals

Information-based syntax and semantics: Vol. 1: fundamentals
A computational framework for lexical description

Computational Linguistics - Special issue of the lexicon
Grammatical category disambiguation by statistical optimization

Computational Linguistics
Modelling Human Speech Comprehension: A computational approach

Modelling Human Speech Comprehension: A computational approach
The Parser Generating System PGS

Software—Practice & Experience
The derivation of a large computational lexicon for English from LDOCE

Computational lexicography for natural language processing
Lalr—a generator for efficient parsers

Software—Practice & Experience
Natural language analysis by stochastic optimization: a progress report on project APRIL

Journal of Experimental & Theoretical Artificial Intelligence
A Trellis-based algorithm for estimating the parameters of a hidden stochastic context-free grammar

HLT '91 Proceedings of the workshop on Speech and Natural Language
A unifying model for lookahead LR parsing

Computer Languages
Generating a grammar for statistical training

HLT '90 Proceedings of the workshop on Speech and Natural Language
Poor estimates of context are worse than none

HLT '90 Proceedings of the workshop on Speech and Natural Language
Defaults in lexical representation

Inheritance, defaults and the lexicon
Methods for Computing LALR(k) Lookahead

ACM Transactions on Programming Languages and Systems (TOPLAS)
Efficient Computation of LALR(1) Look-Ahead Sets

ACM Transactions on Programming Languages and Systems (TOPLAS)
An efficient context-free parsing algorithm

Communications of the ACM
Natural Language Processing in LISP: An Introduction to Computational Linguistics

Natural Language Processing in LISP: An Introduction to Computational Linguistics
Speech Synthesis and Recognition

Speech Synthesis and Recognition
Coping with syntactic ambiguity or how to put the block in the box on the table

Computational Linguistics
The syntactic regularity of English noun phrases

EACL '89 Proceedings of the fourth conference on European chapter of the Association for Computational Linguistics
An extended LR parsing algorithm for grammars using feature-based syntactic categories

EACL '91 Proceedings of the fifth conference on European chapter of the Association for Computational Linguistics
Parse fitting and prose fixing: getting a hold on ill-formedness

Computational Linguistics - Special issue on ill-formed input
LR parsers for natural languages

ACL '84 Proceedings of the 10th International Conference on Computational Linguistics and 22nd annual meeting on Association for Computational Linguistics
The design of a computer language for linguistic information

ACL '84 Proceedings of the 10th International Conference on Computational Linguistics and 22nd annual meeting on Association for Computational Linguistics
Using restriction to extend parsing algorithms for complex-feature-based formalisms

ACL '85 Proceedings of the 23rd annual meeting on Association for Computational Linguistics
Sentence disambiguation by a shift-reduce parsing technique

ACL '83 Proceedings of the 21st annual meeting on Association for Computational Linguistics
Polynomial time and space shift-reduce parsing of arbitrary context-free grammars

ACL '91 Proceedings of the 29th annual meeting on Association for Computational Linguistics
Automatic acquisition of subcategorization frames from untagged text

ACL '91 Proceedings of the 29th annual meeting on Association for Computational Linguistics
Structural ambiguity and lexical relations

ACL '91 Proceedings of the 29th annual meeting on Association for Computational Linguistics
Word association norms, mutual information, and lexicography

ACL '89 Proceedings of the 27th annual meeting on Association for Computational Linguistics
Inside-outside reestimation from partially bracketed corpora

ACL '92 Proceedings of the 30th annual meeting on Association for Computational Linguistics
An integrated framework for semantic and pragmatic interpretation

ACL '88 Proceedings of the 26th annual meeting on Association for Computational Linguistics
Lexing and parsing Modula-2

ACM SIGPLAN Notices
Software support for practical grammar development

COLING '88 Proceedings of the 12th conference on Computational linguistics - Volume 1

Storing logical form in a shared-packed forest

Computational Linguistics
An efficient probabilistic context-free parsing algorithm that computes prefix probabilities

Computational Linguistics
Robust learning, smoothing, and parameter tying on syntactic ambiguity resolution

Computational Linguistics
A chart re-estimation algorithm for a probabilistic recursive transition network

Computational Linguistics
Learning to Parse Natural Language with Maximum Entropy Models

Machine Learning - Special issue on natural language learning
Learning log-linear models on constraint-based grammars for disambiguation

Learning language in logic
Improving learning by choosing examples intelligently in two natural language tasks

Learning language in logic
Logic-based machine learning

Logic-based artificial intelligence
A Theory of Stochastic Grammars

NLP '00 Proceedings of the Second International Conference on Natural Language Processing
Probabilistic top-down parsing and language modeling

Computational Linguistics
Surface-marker-based dialog modelling: A progress report on the MAREDI project

Natural Language Engineering
Robustness beyond shallowness: incremental deep parsing

Natural Language Engineering
An HPSG parser with CFG filtering

Natural Language Engineering
Robust grammatical analysis for spoken dialogue systems

Natural Language Engineering
Improving language models by clustering training sentences

ANLC '94 Proceedings of the fourth conference on Applied natural language processing
Combination of symbolic and statistical approaches for grammatical knowledge acquisition

ANLC '94 Proceedings of the fourth conference on Applied natural language processing
The DINOUS parser

Natural Language Engineering
Automatic extraction of subcategorization from corpora

ANLC '97 Proceedings of the fifth conference on Applied natural language processing
The problem of computing the most probable tree in data-oriented parsing and stochastic tree grammars

EACL '95 Proceedings of the seventh conference on European chapter of the Association for Computational Linguistics
MindNet: acquiring and structuring semantic information from text

COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 2
Precise n-gram probabilities from stochastic context-free grammars

ACL '94 Proceedings of the 32nd annual meeting on Association for Computational Linguistics
Relating complexity to practical performance in parsing with wide-coverage unification grammars

ACL '94 Proceedings of the 32nd annual meeting on Association for Computational Linguistics
An integrated heuristic scheme for partial parse evaluation

ACL '94 Proceedings of the 32nd annual meeting on Association for Computational Linguistics
A new statistical parser based on bigram lexical dependencies

ACL '96 Proceedings of the 34th annual meeting on Association for Computational Linguistics
Fast parsing using pruning and grammar specialization

ACL '96 Proceedings of the 34th annual meeting on Association for Computational Linguistics
Parsing with the shortest derivation

COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 1
A hybrid Japanese parser with hand-crafted grammar and statistics

COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 1
A reestimation algorithm for Probabilistic Recursive Transition Network

COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 2
Computing first and follow functions for feature theoretic grammars

COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 2
Weakly restricted stochastic grammars

COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 2
Notes on LR parser design

COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 1
LHIP: extended DCGs for configurable robust parsing

COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 1
Using discourse predictions for ambiguity resolution

COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 1
A context-sensitive model for probabilistic LR parsing of spoken language with transformation-based postprocessing

COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 2
Robust, finite-state parsing for spoken language understanding

ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
Relating probabilistic grammars and automata

ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
A comparison of parsing technologies for the biomedical domain

Natural Language Engineering
Using grammatical relations to compare parsers

EACL '03 Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - Volume 1
A novel disambiguation method for unification-based grammars using probabilistic context-free approximations

COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
XML-based data preparation for robust deep parsing

ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
Joint and conditional estimation of tagging and parsing models

ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
An information-theory-based feature type analysis for the modelling of statistical parsing

ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
Disambiguating Nouns, Verbs, and Adjectives Using Automatically Acquired Selectional Preferences

Computational Linguistics
Using semantically motivated estimates to help subcategorization acquisition

EMNLP '00 Proceedings of the 2000 Joint SIGDAT conference on Empirical methods in natural language processing and very large corpora: held in conjunction with the 38th Annual Meeting of the Association for Computational Linguistics - Volume 13
Fast LR parsing using rich (Tree Adjoining) Grammars

EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
XML-based NLP tools for analysing and annotating medical language

NLPXML '02 Proceedings of the 2nd workshop on NLP and XML - Volume 17
An efficient LR parser generator for tree-adjoining grammars

New developments in parsing technology
Optimal ambiguity packing in context-free parsers with interleaved unification

New developments in parsing technology
Probabilistic parsing strategies

Journal of the ACM (JACM)
Probabilistic parsing strategies

ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
An alternative method of training probabilistic LR parsers

ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
A best-first probabilistic shift-reduce parser

COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
From ubgs to cfgs a practical corpus-driven approach

Natural Language Engineering
Wide-coverage deep statistical parsing using automatic dependency structure annotation

Computational Linguistics
A general feature space for automatic verb classification

Natural Language Engineering
Automatic generation of weather forecast texts using comprehensive probabilistic generation-space models

Natural Language Engineering
A method of incorporating bigram constraints into an LR table and its effectiveness in natural language processing

NeMLaP3/CoNLL '98 Proceedings of the Joint Conferences on New Methods in Language Processing and Computational Natural Language Learning
Deterministic shift-reduce parsing for unification-based grammars by using default unification

EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics
An efficient algorithm to induce minimum average lookahead grammars for incremental LR parsing

IncrementParsing '04 Proceedings of the Workshop on Incremental Parsing: Bringing Engineering and Cognition Together
Backbone extraction and pruning for speeding up a deep parser for dialogue systems

ScaNaLU '06 Proceedings of the Third Workshop on Scalable Natural Language Understanding
Efficacy of beam thresholding, unification filtering and hybrid parsing in probabilistic HPSG parsing

Parsing '05 Proceedings of the Ninth International Workshop on Parsing Technology
PrepLex: a lexicon of French prepositions for parsing

SigSem '07 Proceedings of the Fourth ACL-SIGSEM Workshop on Prepositions
Transition-based parsing of the Chinese treebank using a global discriminative model

IWPT '09 Proceedings of the 11th International Conference on Parsing Technologies
Growing TreeLex

CICLing'08 Proceedings of the 9th international conference on Computational linguistics and intelligent text processing
Generalized queries on probabilistic context-free grammars

AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 2
Multithreaded parsing for predicting RNA secondary structures

International Journal of Bioinformatics Research and Applications
Computational linguistics and natural language processing

CICLing'11 Proceedings of the 12th international conference on Computational linguistics and intelligent text processing - Volume Part I
Dependency syntax analysis using grammar induction and a lexical categories precedence system

CICLing'11 Proceedings of the 12th international conference on Computational linguistics and intelligent text processing - Volume Part I
Syntactic processing using the generalized perceptron and beam search

Computational Linguistics
Anaphora resolution with word sense disambiguation

SENSEVAL '01 The Proceedings of the Second International Workshop on Evaluating Word Sense Disambiguation Systems
Efficient large-scale parsing: a survey

Proceedings of the COLING-2000 Workshop on Efficiency In Large-Scale Parsing Systems

Abstract

We describe work toward the construction of a very wide-coverage probabilistic parsing system for natural language (NL), based on LR parsing techniques. The system is intended to rank the large number of syntactic analyses produced by NL grammars according to the frequency of occurrence of the individual rules deployed in each analysis. We discuss a fully automatic procedure for constructing an LR parse table from a unification-based grammar formalism, and consider the suitability of alternative LALR(1) parse table construction methods for large grammars. The parse table is used as the basis for two parsers: a user-driven interactive system, which provides a computationally tractable and labor-efficient method of supervised training of the statistical information required to drive the probabilistic parser, and the probabilistic parser itself. The latter is constructed by associating probabilities with the LR parse table directly. This technique is superior to parsers based on probabilistic lexical tagging or probabilistic context-free grammar because it allows for a more context-dependent probabilistic language model, as well as use of a more linguistically adequate grammar formalism. We compare the performance of an optimized variant of Tomita's (1987) generalized LR parsing algorithm to an (efficiently indexed and optimized) chart parser. We report promising results of a pilot study training on 150 noun definitions from the Longman Dictionary of Contemporary English (LDOCE) and retesting on these plus a further 55 definitions. Finally, we discuss limitations of the current system and possible extensions to deal with lexical (syntactic and semantic) frequency of occurrence.
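
The idea of associating probabilities with the LR parse table directly can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not give the estimation details. The sketch assumes that each parse is recorded as a sequence of (state, action) pairs, that an action's probability is its relative frequency among the actions observed in that state during supervised training, and that a parse is scored by the product of its action probabilities with no smoothing. The function names, state numbers, and action labels are all hypothetical.

```python
from collections import Counter, defaultdict
from math import prod


def estimate_action_probs(training_derivations):
    """Estimate per-state action probabilities as relative frequencies.

    Each derivation is a list of (state, action) pairs, a hypothetical
    encoding of the parse-table entries used by a hand-selected parse.
    """
    counts = defaultdict(Counter)
    for derivation in training_derivations:
        for state, action in derivation:
            counts[state][action] += 1
    return {
        state: {action: n / sum(actions.values()) for action, n in actions.items()}
        for state, actions in counts.items()
    }


def score(derivation, probs):
    """Product of action probabilities; unseen actions score 0.0 (no smoothing)."""
    return prod(probs.get(state, {}).get(action, 0.0) for state, action in derivation)


def rank(derivations, probs):
    """Order competing analyses from most to least probable."""
    return sorted(derivations, key=lambda d: score(d, probs), reverse=True)


# Toy training data: in state 1 the reduce action was chosen twice, the shift once.
training = [
    [(0, "shift"), (1, "reduce NP")],
    [(0, "shift"), (1, "reduce NP")],
    [(0, "shift"), (1, "shift")],
]
probs = estimate_action_probs(training)

# Two competing analyses of the same input, differing in the action taken in state 1.
analysis_a = [(0, "shift"), (1, "shift")]
analysis_b = [(0, "shift"), (1, "reduce NP")]
best = rank([analysis_a, analysis_b], probs)[0]
```

Because the probabilities are attached to table states rather than to grammar rules alone, the same rule can receive different probabilities in different parse contexts, which is the kind of context dependence the abstract contrasts with probabilistic context-free grammar.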