The probabilistic relation between verbs and their arguments plays an important role in modern statistical parsers and supertaggers, and in psychological theories of language processing. But the two research communities compute these probabilities in very different ways: computational linguists estimate verb subcategorization probabilities from large corpora, while psycholinguists estimate them from psychological studies (sentence production and completion tasks). Recent studies have found differences between corpus frequencies and psycholinguistic measures. We analyze subcategorization frequencies from four different corpora: psychological sentence production data (Connine et al. 1984), written text (Brown and WSJ), and telephone conversation data (Switchboard). We find two sources for these differences. Discourse influence results from the way verb use is affected by discourse type, such as narrative, connected discourse, and single-sentence productions. Semantic influence results from different corpora using different senses of verbs, which have different subcategorization frequencies. We conclude that verb sense and discourse type play an important role in the frequencies observed in different experimental and corpus-based sources of verb subcategorization frequencies.