Automatic extraction of subcategorization frames for Czech

Authors:
Anoop Sarkar; Daniel Zeman
Affiliations:
University of Pennsylvania, Philadelphia, PA; Univerzita Karlova, Praha, Czechia
Venue:
COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 2
Year:
2000

Abstract

We present novel machine learning techniques for identifying subcategorization information for verbs in Czech, and compare three statistical techniques applied to this problem. We show how the learning algorithm can discover previously unknown subcategorization frames from the Czech Prague Dependency Treebank. The algorithm can then label each dependent of a verb in the treebank as either an argument or an adjunct. Using these techniques, we achieve 88% precision on unseen parsed text.
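The abstract does not name the three statistical techniques it compares. To illustrate the general idea of statistically filtering candidate frames, here is a minimal sketch of one test commonly used in this literature, a binomial hypothesis test. It is not presented as the authors' method: the function names, the `error_rate` parameter (the assumed probability that a frame appears with a verb by chance), and the threshold `alpha` are all hypothetical.

```python
from math import comb


def binomial_tail(n, m, p):
    """Return P(X >= m) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(m, n + 1))


def accept_frame(verb_count, frame_count, error_rate=0.05, alpha=0.05):
    """Decide whether a candidate frame belongs to a verb.

    verb_count  -- number of times the verb was observed (assumed input)
    frame_count -- number of those observations showing the candidate frame
    error_rate  -- assumed chance of seeing the frame spuriously (hypothetical)
    alpha       -- significance threshold (hypothetical)

    The frame is accepted when seeing it this often would be unlikely
    if every occurrence were noise.
    """
    return binomial_tail(verb_count, frame_count, error_rate) < alpha


if __name__ == "__main__":
    # A frame seen with 20 of 100 verb tokens versus one seen with 5 of 100.
    print(accept_frame(100, 20))
    print(accept_frame(100, 5))
```

Under these assumed settings, a frame observed in 20 of 100 occurrences is accepted, while one observed in 5 of 100 is indistinguishable from noise and rejected. Dependents that appear only in rejected frames would then be treated as adjuncts rather than arguments.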