Learning grammars for different parsing tasks by partition search

Authors:
Anja Belz
Affiliations:
University of Brighton, Brighton, UK
Venue:
COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
Year:
2002

Abstract

This paper describes a comparative application of Grammar Learning by Partition Search to four different learning tasks: deep parsing, NP identification, flat phrase chunking and NP chunking. In the experiments, base grammars were extracted from a treebank corpus. From this starting point, new grammars optimised for the different parsing tasks were learnt by Partition Search. No lexical information was used. In half of the experiments, local structural context in the form of parent phrase category information was incorporated into the grammars. Results show that grammars which contain this information outperform grammars which do not by large margins in all tests for all parsing tasks. Parent phrase category information makes the biggest difference for deep parsing, typically corresponding to an improvement of around 5%. Overall, Partition Search with parent phrase category information is shown to be a successful method for learning grammars optimised for a given parsing task, and for minimising grammar size. The biggest margin of improvement over a base grammar was a 5.4% increase in the F-Score for deep parsing. The biggest size reductions were 93.5% fewer nonterminals (for NP identification), and 31.3% fewer rules (for XP chunking).
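
The abstract does not spell out the search procedure itself. As a rough illustration only, assuming that a partition groups nonterminals into blocks and that each block is merged into a single symbol, the effect of one candidate partition on grammar size might be sketched as below. The function names (`merge_nonterminals`, `nonterminals`, `reduction`), the parent-annotated labels such as `NP-S`, and the toy grammar are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Set, Tuple

# A rule is (left-hand side, right-hand side symbols). Hypothetical representation.
Rule = Tuple[str, Tuple[str, ...]]


def merge_nonterminals(rules: Set[Rule], partition: Dict[str, str]) -> Set[Rule]:
    """Rename each symbol to its block label; symbols not in `partition` are kept."""
    def rename(symbol: str) -> str:
        return partition.get(symbol, symbol)

    # Using a set means rules that become identical after merging collapse into one.
    return {(rename(lhs), tuple(rename(s) for s in rhs)) for lhs, rhs in rules}


def nonterminals(rules: Set[Rule]) -> Set[str]:
    """Symbols that appear on the left-hand side of at least one rule."""
    return {lhs for lhs, _ in rules}


def reduction(before: int, after: int) -> float:
    """Percentage reduction from `before` to `after`."""
    return 100.0 * (before - after) / before


# Toy base grammar with parent-annotated NPs (assumed notation):
# NP-S is an NP whose parent is S, NP-VP an NP whose parent is VP.
base: Set[Rule] = {
    ("S", ("NP-S", "VP")),
    ("VP", ("V", "NP-VP")),
    ("NP-S", ("Det", "N")),
    ("NP-VP", ("Det", "N")),
    ("NP-VP", ("N",)),
}

# One candidate partition: both annotated NPs fall into a single block "NP".
partition = {"NP-S": "NP", "NP-VP": "NP"}
merged = merge_nonterminals(base, partition)

rule_reduction = reduction(len(base), len(merged))
nonterminal_reduction = reduction(len(nonterminals(base)), len(nonterminals(merged)))
```

In this toy case the two `Det N` rules collapse into one, so merging reduces both the rule count and the nonterminal count. A search over partitions would additionally need to score each candidate grammar on the target parsing task, which this sketch leaves out.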