Parsing the Wall Street Journal with the inside-outside algorithm

Authors:
Yves Schabes; Michal Roth; Randy Osborne
Affiliations:
Mitsubishi Electric Research Laboratories, Cambridge, MA; Mitsubishi Electric Research Laboratories, Cambridge, MA; Mitsubishi Electric Research Laboratories, Cambridge, MA
Venue:
EACL '93 Proceedings of the sixth conference on European chapter of the Association for Computational Linguistics
Year:
1993

Abstract

We report grammar inference experiments on partially parsed sentences taken from the Wall Street Journal corpus using the inside-outside algorithm for stochastic context-free grammars. The initial grammar for the inference process makes no assumptions about the kinds of structures and their distributions. The inferred grammar is evaluated by its predictive power and by comparing the bracketing that it imposes on held-out sentences with the partial bracketings of these sentences given in the corpus. Using part-of-speech tags as the only source of lexical information, high bracketing accuracy is achieved even with a small subset of the available training material (1045 sentences): 94.4% for test sentences shorter than 10 words and 90.2% for sentences shorter than 15 words.
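
The abstract does not spell out how bracketing accuracy is computed. As a rough illustration only, the sketch below assumes one common reading: the fraction of brackets proposed by the inferred grammar that do not cross any bracket in the corpus's partial bracketing. The names `crosses` and `bracketing_accuracy`, the half-open span convention, and the example spans are all hypothetical and are not taken from the paper.

```python
from typing import Iterable, Tuple

# A bracket is a half-open token span (start, end); this convention is an assumption.
Span = Tuple[int, int]


def crosses(a: Span, b: Span) -> bool:
    """Return True if the spans overlap without one containing the other."""
    (i, j), (k, l) = a, b
    return (i < k < j < l) or (k < i < l < j)


def bracketing_accuracy(predicted: Iterable[Span], gold: Iterable[Span]) -> float:
    """Fraction of predicted brackets that cross no gold (partial) bracket."""
    predicted = list(predicted)
    gold = list(gold)
    if not predicted:
        # Nothing was proposed, so nothing conflicts with the corpus bracketing.
        return 1.0
    compatible = sum(
        1 for p in predicted if not any(crosses(p, g) for g in gold)
    )
    return compatible / len(predicted)


# Made-up spans for a 6-token sentence, for illustration only.
gold = [(0, 2), (2, 6), (3, 6)]
predicted = [(0, 2), (1, 3), (3, 6), (0, 6)]

# (1, 3) crosses the gold bracket (0, 2); the other three are compatible.
print(bracketing_accuracy(predicted, gold))  # 0.75
```

Because the corpus bracketing is only partial, a non-crossing criterion of this kind does not penalize the grammar for proposing extra brackets that the annotators left unspecified. Whether the reported 94.4% and 90.2% figures use exactly this definition should be checked against the paper itself.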