This paper describes an implemented program that takes a tagged text corpus and generates a partial list of the subcategorization frames in which each verb occurs. The completeness of the output list increases monotonically with the total occurrences of each verb in the training corpus. False positive rates are one to three percent. Five subcategorization frames are currently detected and we foresee no impediment to detecting many more. Ultimately, we expect to provide a large subcategorization dictionary to the NLP community and to train dictionaries for specific corpora.
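To make the input and output concrete, the following is a minimal sketch of cue-based frame collection over tagged text. It is not the program described above: the Penn Treebank-style tags, the three cue patterns, the frame labels, and the function names `frame_after` and `collect_frames` are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Assumed tag set: Penn Treebank-style verb tags (not taken from the paper).
VERB_TAGS = {"VB", "VBD", "VBZ", "VBP"}

def frame_after(tokens, i):
    """Return a hypothetical frame label for the verb at index i, or None."""
    rest = tokens[i + 1:i + 3]  # look at most two tokens to the right
    if not rest:
        return None
    word, tag = rest[0]
    if word.lower() == "that":
        return "that-clause"
    if tag == "TO" and len(rest) > 1 and rest[1][1] == "VB":
        return "infinitive"
    if tag == "PRP":
        return "direct-object"
    return None

def collect_frames(tagged_sentences):
    """Map each verb (surface form, lowercased) to the set of frames observed."""
    frames = defaultdict(set)
    for sent in tagged_sentences:
        for i, (word, tag) in enumerate(sent):
            if tag in VERB_TAGS:
                label = frame_after(sent, i)
                if label:
                    frames[word.lower()].add(label)
    return dict(frames)

# Toy tagged corpus, invented for this example.
corpus = [
    [("I", "PRP"), ("expect", "VBP"), ("to", "TO"), ("win", "VB")],
    [("I", "PRP"), ("expect", "VBP"), ("that", "IN"), ("he", "PRP"), ("left", "VBD")],
    [("They", "PRP"), ("saw", "VBD"), ("him", "PRP")],
]
lexicon = collect_frames(corpus)
```

Because each verb's entry is a set that only grows as more sentences are read, the list of frames in this sketch can never shrink with additional data, which mirrors the monotonic completeness property stated in the abstract. The sketch does no lemmatization and makes no attempt to control false positives.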