Parsing and subcategorization data

Authors:
Jianguo Li
Affiliations:
The Ohio State University, Columbus, OH
Venue:
COLING ACL '06 Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop
Year:
2006

References (12):
Cited by: 0

Head-driven statistical models for natural language parsing.
Robust probabilistic predictive syntactic processing: motivations, models, and applications.
Accurate methods for the statistics of surprise and coincidence. Computational Linguistics - Special issue on using large corpora: I.
From grammar to lexicon: unsupervised learning of lexical syntax. Computational Linguistics - Special issue on using large corpora: II.
Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics - Special issue on using large corpora: II.
Automatic verb classification based on statistical distributions of argument structure. Computational Linguistics.
A maximum-entropy-inspired parser. NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference.
How verb subcategorization frequencies are affected by corpus choice. COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 2.
Clustering verbs semantically according to their alternation behaviour. COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 2.
Verb class disambiguation using informative priors. Computational Linguistics.
Intricacies of Collins' Parsing Model. Computational Linguistics.
Parsing and disfluency placement. EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10.

Abstract

In this paper, we compare the performance of a state-of-the-art statistical parser (Bikel, 2004) on written and spoken language, both in parsing and in generating subcategorization cues. Although Bikel's parser parses written language more accurately, it extracts subcategorization cues more accurately from spoken language. We also examine whether punctuation helps parsing and the extraction of subcategorization cues. Our experiments show that punctuation is of little help for either task on spoken language. This indicates that there is no need to add punctuation when transcribing spoken corpora simply to help parsers.