Automatic acquisition of a large subcategorization dictionary from corpora

Authors: Christopher D. Manning
Affiliations: Stanford University, Stanford, CA
Venue: ACL '93 Proceedings of the 31st annual meeting on Association for Computational Linguistics
Year: 1993

Abstract

This paper presents a new method for producing a dictionary of subcategorization frames from unlabelled text corpora. It is shown that statistical filtering of the results of a finite-state parser running on the output of a stochastic tagger produces high-quality results, despite the error rates of the tagger and the parser. Further, it is argued that this method can be used to learn all subcategorization frames, whereas previous methods are not extensible to a general solution to the problem.
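
The abstract does not say which statistical filter is applied, so the following is only an illustrative sketch of one plausible form: a one-sided binomial test that keeps a frame for a verb when the frame's cue count is too high to be explained by tagger and parser noise alone. The function names, the per-frame error rates, the significance threshold, and the toy counts are all assumptions made for this example and are not taken from the paper.

```python
from math import comb


def binomial_tail(n, m, p):
    """Return P(X >= m) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(m, n + 1))


def filter_frames(cue_counts, verb_counts, error_rate, alpha=0.02):
    """Keep (verb, frame) pairs whose cue counts are unlikely to be pure noise.

    cue_counts:  {(verb, frame): times the parser reported the frame with the verb}
    verb_counts: {verb: total occurrences of the verb}
    error_rate:  {frame: assumed probability that a reported cue is spurious}
    alpha:       assumed significance threshold (hypothetical value)
    """
    accepted = {}
    for (verb, frame), m in cue_counts.items():
        n = verb_counts[verb]
        p = error_rate[frame]
        # Null hypothesis: the verb does not take the frame, so every cue
        # is an error occurring independently with probability p.
        if binomial_tail(n, m, p) < alpha:
            accepted.setdefault(verb, set()).add(frame)
    return accepted


# Toy counts, invented purely for illustration.
cue_counts = {("give", "NP_NP"): 12, ("sleep", "NP"): 1}
verb_counts = {"give": 40, "sleep": 50}
error_rate = {"NP_NP": 0.05, "NP": 0.05}

lexicon = filter_frames(cue_counts, verb_counts, error_rate)
print(lexicon)
```

With these toy numbers, twelve cues in forty occurrences are far more than a 5% error rate would predict, so the frame is kept for "give", while a single cue in fifty occurrences of "sleep" is consistent with noise and is dropped. A filter of this shape is what lets an error-prone tagger and parser still yield a clean dictionary: individual mistakes are tolerated as long as they do not accumulate beyond the assumed noise level.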