Automatic acquisition of subcategorization frames from tagged text

Authors:
Michael R. Brent;Robert C. Berwick
Affiliations:
-;-
Venue:
HLT '91 Proceedings of the workshop on Speech and Natural Language
Year:
1991

Citing 7
Cited 2

Word association norms, mutual information, and lexicography

Computational Linguistics
Deducing linguistic structure from the statistics of large corpora

HLT '90 Proceedings of the workshop on Speech and Natural Language
A stochastic parts program and noun phrase parser for unrestricted text

ANLC '88 Proceedings of the second conference on Applied natural language processing
Automatic semantic classification of verbs from their syntactic contexts: an implemented classifier for stativity

EACL '91 Proceedings of the fifth conference on European chapter of the Association for Computational Linguistics
Parsing the LOB corpus

ACL '90 Proceedings of the 28th annual meeting on Association for Computational Linguistics
Automatically extracting and representing collocations for language generation

ACL '90 Proceedings of the 28th annual meeting on Association for Computational Linguistics
Noun classification from predicate-argument structures

ACL '90 Proceedings of the 28th annual meeting on Association for Computational Linguistics

Automatic acquisition of a large subcategorization dictionary from corpora

ACL '93 Proceedings of the 31st annual meeting on Association for Computational Linguistics
Learning semantic grammars with constructive inductive logic programming

AAAI'93 Proceedings of the eleventh national conference on Artificial intelligence

Quantified Score

Hi-index	0.00

Visualization

Abstract

This paper describes an implemented program that takes a tagged text corpus and generates a partial list of the subcategorization frames in which each verb occurs. The completeness of the output list increases monotonically with the total occurrences of each verb in the training corpus. False positive rates are one to three percent. Five subcategorization frames are currently detected and we foresee no impediment to detecting many more. Ultimately, we expect to provide a large subcategorization dictionary to the NLP community and to train dictionaries for specific corpora.