Automatic extraction of subcorpora based on subcategorization frames from a part-of-speech tagged corpus

Authors:
Susanne Gahl
Affiliations:
ICSI, Berkeley, CA
Venue:
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
Year:
1998

Abstract

This paper presents a method for extracting subcorpora documenting different subcategorization frames for verbs, nouns, and adjectives in the 100-million-word British National Corpus. The extraction tool consists of a set of batch files for use with the Corpus Query Processor (CQP), which is part of the IMS Corpus Workbench (cf. Christ 1994a, b). A macroprocessor has been developed that allows the user to specify in a simple input file which subcorpora are to be created for a given lemma. The resulting subcorpora can be used (1) to provide evidence for the subcategorization properties of a given lemma, and to facilitate the selection of corpus lines for lexicographic research, and (2) to determine the frequencies of different syntactic contexts of each lemma.
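
The macro-expansion idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's tool. The input-file format (`lemma: frame1 frame2`), the frame names, the query templates, and the attribute names `lemma` and `pos` are all assumptions made for illustration; the tag patterns loosely follow the BNC C5 tagset. The sketch expands each lemma/frame pair into a CQP-style named query and, given hit counts per subcorpus, computes the relative frequency of each frame.

```python
# Hypothetical frame templates: the names and patterns are illustrative only,
# not the paper's actual macro definitions. "{lemma}" is filled in per lemma.
FRAME_TEMPLATES = {
    "np": '[lemma="{lemma}" & pos="V.*"] [pos="AT0|DPS"]? [pos="AJ.*"]* [pos="N.*"]',
    "that_clause": '[lemma="{lemma}" & pos="V.*"] [word="that"]',
    "to_inf": '[lemma="{lemma}" & pos="V.*"] [word="to"] [pos="VVI"]',
}


def parse_spec(text):
    """Parse lines of the assumed form 'lemma: frame1 frame2' into a dict."""
    spec = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        lemma, frames = line.split(":", 1)
        spec[lemma.strip()] = frames.split()
    return spec


def expand(spec):
    """Return one CQP-style named query per lemma/frame pair."""
    commands = []
    for lemma, frames in spec.items():
        for frame in frames:
            query = FRAME_TEMPLATES[frame].format(lemma=lemma)
            name = f"{lemma.capitalize()}_{frame}"
            commands.append(f"{name} = {query};")
    return commands


def relative_frequencies(counts):
    """Turn hit counts per frame into shares of the total number of hits."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {frame: n / total for frame, n in counts.items()}


if __name__ == "__main__":
    for command in expand(parse_spec("give: np that_clause\n")):
        print(command)
```

In this sketch each generated line would be passed to the query processor to create one subcorpus, and the sizes of the resulting subcorpora would feed `relative_frequencies`. Real patterns would need to be considerably more careful than these templates, for example about intervening adverbs and passives.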