Semi-supervised CCG lexicon extension

Authors:
Emily Thomforde;Mark Steedman
Affiliations:
University of Edinburgh;University of Edinburgh
Venue:
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Year:
2011

Abstract

This paper introduces Chart Inference (CI), an algorithm for deriving a CCG category for an unknown word from a partial parse chart. It is shown to be faster and more precise than a baseline brute-force method, and to achieve wider coverage than a rule-based system. In addition, we show the application of CI to a domain adaptation task for question words, which are largely missing in the Penn Treebank. When used in combination with self-training, CI increases the precision of the baseline StatCCG parser over subject-extraction questions by 50%. An error analysis shows that CI contributes to the increase by expanding the number of category types available to the parser, while self-training adjusts the counts.
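
The abstract does not spell out how CI works, so the following is only an illustrative sketch of the general idea of reading a category for an unknown word off its surrounding context. It is not the authors' algorithm. It assumes the categories of the adjacent constituents and the category of the whole span are already known, and it inverts only CCG forward and backward application, with the unknown word acting as the functor. The function names `wrap` and `infer_categories` are hypothetical.

```python
# Illustrative sketch only: NOT the Chart Inference algorithm from the paper.
# Assumption: the unknown word is the functor, and the categories of its
# neighbours and of the resulting span are already available.

def wrap(cat):
    """Parenthesise complex categories so they can be nested safely."""
    return f"({cat})" if ("/" in cat or "\\" in cat) else cat


def infer_categories(left, right, result):
    """Return candidate categories for an unknown word.

    left, right: category of the adjacent constituent on that side, or None.
    result: category that the whole span should receive.
    """
    candidates = []
    if left is not None and right is None:
        # left X => result by backward application, so X = result\left
        candidates.append(f"{wrap(result)}\\{wrap(left)}")
    elif right is not None and left is None:
        # X right => result by forward application, so X = result/right
        candidates.append(f"{wrap(result)}/{wrap(right)}")
    elif left is not None and right is not None:
        # Combine with the right neighbour first, then the left one.
        candidates.append(f"({wrap(result)}\\{wrap(left)})/{wrap(right)}")
        # Combine with the left neighbour first, then the right one.
        candidates.append(f"({wrap(result)}/{wrap(right)})\\{wrap(left)}")
    return candidates


# An unknown question word followed by a verb phrase of category S\NP,
# where the whole span should be a sentence S.
print(infer_categories(None, "S\\NP", "S"))
```

In this toy setting a sentence-initial unknown word followed by a verb phrase receives a single candidate, a function from a verb phrase to a sentence, which is the general shape of category a subject-extraction question word needs. A fuller treatment would also have to cover composition, type-raising, feature-marked categories, and ranking among competing candidates, none of which are modelled here.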