A probabilistic parser

Authors:
Roger Garside;Fanny Leech
Affiliations:
University of Lancaster, Bailrigg, Lancaster, U.K.;University of Lancaster, Bailrigg, Lancaster, U.K.
Venue:
EACL '85 Proceedings of the second conference on European chapter of the Association for Computational Linguistics
Year:
1985


Abstract

The UCREL team at the University of Lancaster is developing a robust parsing mechanism that will assign the appropriate grammatical structure to sentences in unconstrained English text. The techniques involve calculating probabilities for competing structures, and are based on those used successfully in tagging the LOB (Lancaster-Oslo/Bergen) corpus, i.e. assigning a grammatical word class to each word.

The first step in the parsing process is a dictionary lookup of successive pairs of grammatically tagged words, which gives a number of possible continuations to the current parse. Since this lookup often cannot determine unambiguously where a grammatical constituent should be closed, the second step of the parsing process has to insert closures and distinguish between alternative parses. It will generate trees representing these possible alternatives, insert closure points for the constituents, and compute a probability for each parse tree from the probability of each constituent within the tree. It will then be able to select a preferred parse or parses for output.

The probability of a grammatical constituent is derived from a bank of manually parsed sentences.
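The second step described in the abstract, scoring competing parse trees from the probabilities of their constituents and selecting a preferred parse, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not say how constituent probabilities are combined, so the sketch simply multiplies them; the probability table, the `UNSEEN` fallback value, the example sentence, and the names `Node`, `tree_probability` and `preferred_parses` are all hypothetical. In the system described, the probabilities would be derived from a bank of manually parsed sentences.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from math import prod


@dataclass
class Node:
    """A parse-tree node; a node without children is a tagged word."""
    label: str
    children: list[Node] = field(default_factory=list)

    def shape(self) -> tuple[str, tuple[str, ...]]:
        # A constituent is identified by its label and its daughters' labels.
        return (self.label, tuple(c.label for c in self.children))


# Hypothetical constituent probabilities (invented numbers, not from the paper).
CONSTITUENT_PROB = {
    ("S", ("NP", "VP")): 0.6,
    ("NP", ("AT", "NN")): 0.4,
    ("NP", ("AT", "NN", "PP")): 0.2,
    ("VP", ("VBD", "NP")): 0.3,
    ("VP", ("VBD", "NP", "PP")): 0.1,
    ("PP", ("IN", "NP")): 0.8,
}
UNSEEN = 1e-6  # assumed fallback for constituents missing from the table


def tree_probability(node: Node) -> float:
    """Assumed scoring rule: product of all constituent probabilities."""
    if not node.children:  # a tagged word contributes no factor
        return 1.0
    own = CONSTITUENT_PROB.get(node.shape(), UNSEEN)
    return own * prod(tree_probability(c) for c in node.children)


def preferred_parses(trees: list[Node], n: int = 1) -> list[Node]:
    """Return the n most probable of the competing parse trees."""
    return sorted(trees, key=tree_probability, reverse=True)[:n]


def np_() -> Node:
    return Node("NP", [Node("AT"), Node("NN")])


def pp_() -> Node:
    return Node("PP", [Node("IN"), np_()])


# Two competing parses of a tag sequence such as AT NN VBD AT NN IN AT NN
# ("the man saw the dog with the telescope").
# tree_a closes the object NP early and attaches the PP to the VP.
tree_a = Node("S", [np_(), Node("VP", [Node("VBD"), np_(), pp_()])])
# tree_b keeps the object NP open and attaches the PP inside it.
tree_b = Node("S", [np_(), Node("VP", [Node("VBD"),
              Node("NP", [Node("AT"), Node("NN"), pp_()])])])

best = preferred_parses([tree_a, tree_b])[0]
```

With these invented numbers the sketch prefers `tree_b`, the parse in which the object noun phrase is closed later; a different probability table could reverse that choice, which is the sense in which the position of a closure is decided probabilistically.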