EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics
We present an approach for smoothing treebank-PCFG lexicons by interpolating lexical parameter estimates from a treebank with estimates obtained from unannotated data via the inside-outside algorithm. Because the PCFG has complex lexical categories, relative-frequency estimates from a treebank are very sparse. Smoothing these complex lexical categories improves parsing performance, with a particular advantage in identifying obligatory arguments subcategorized by verbs unseen in the treebank.
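The interpolation idea can be illustrated with a minimal sketch. It assumes plain linear interpolation with a single fixed weight `lam`, and it assumes that the expected counts from unannotated data have already been produced by an inside-outside run, which is not implemented here. The function names, the category label `VB.np`, and all counts are hypothetical and chosen only for illustration; the abstract does not specify the actual interpolation scheme.

```python
from collections import defaultdict


def relative_frequencies(counts):
    """Convert {(category, word): count} into P(word | category)."""
    totals = defaultdict(float)
    for (category, _word), c in counts.items():
        totals[category] += c
    return {
        (category, word): c / totals[category]
        for (category, word), c in counts.items()
        if totals[category] > 0
    }


def interpolate(p_treebank, p_unannotated, lam=0.5):
    """Linear interpolation: lam * treebank + (1 - lam) * unannotated.

    A pair missing from one source contributes probability 0 from it.
    """
    keys = set(p_treebank) | set(p_unannotated)
    return {
        k: lam * p_treebank.get(k, 0.0) + (1 - lam) * p_unannotated.get(k, 0.0)
        for k in keys
    }


# Toy treebank counts for a hypothetical transitive-verb category.
treebank_counts = {("VB.np", "eat"): 3, ("VB.np", "see"): 1}

# Toy expected counts standing in for inside-outside output (made up).
unannotated_counts = {
    ("VB.np", "eat"): 2.0,
    ("VB.np", "see"): 1.0,
    ("VB.np", "devour"): 1.0,  # verb unseen in the toy treebank
}

p_tb = relative_frequencies(treebank_counts)
p_un = relative_frequencies(unannotated_counts)
smoothed = interpolate(p_tb, p_un, lam=0.5)
```

In this toy setting the verb absent from the treebank counts receives nonzero probability under the complex category, which is the kind of effect the smoothing is meant to achieve for unseen verbs.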