Verb class discovery from rich syntactic data

Authors:
Lin Sun;Anna Korhonen;Yuval Krymolowski
Affiliations:
Computer Laboratory, University of Cambridge, Cambridge, UK;Computer Laboratory, University of Cambridge, Cambridge, UK;Department of Computer Science, University of Haifa, Haifa, Israel
Venue:
CICLing'08 Proceedings of the 9th international conference on Computational linguistics and intelligent text processing
Year:
2008

Citing 16
Cited 10

Large lexicons for natural language processing: utilising the grammar coding system of LDOCE

Computational Linguistics - Special issue of the lexicon
The nature of statistical learning theory

The nature of statistical learning theory
Inducing Features of Random Fields

IEEE Transactions on Pattern Analysis and Machine Intelligence
Large-Scale Dictionary Construction for Foreign Language Tutoring and Interlingual Machine Translation

Machine Translation
Class-Based Construction of a Verb Lexicon

Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence
Automatic verb classification based on statistical distributions of argument structure

Computational Linguistics
Automatic extraction of subcategorization from corpora

ANLC '97 Proceedings of the fifth conference on Applied natural language processing
Comlex Syntax: building a computational lexicon

COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 1
Using a probabilistic class-based lexicon for lexical ambiguity resolution

COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 2
Clustering verbs semantically according to their alternation behaviour

COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 2
Investigations into the role of lexical semantics in word sense disambiguation

Investigations into the role of lexical semantics in word sense disambiguation
Experiments on the Automatic Induction of German Semantic Verb Classes

Computational Linguistics
A high-performance semi-supervised learning method for text chunking

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Automatic classification of verbs in biomedical texts

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
A general feature space for automatic verb classification

Natural Language Engineering
Putting pieces together: combining FrameNet, VerbNet and WordNet for robust semantic parsing

CICLing'05 Proceedings of the 6th international conference on Computational Linguistics and Intelligent Text Processing

The choice of features for classification of verbs in biomedical texts

COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
Semantic classification with distributional kernels

COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
Automatic fine-grained semantic classification for domain adaptation

STEP '08 Proceedings of the 2008 Conference on Semantics in Text Processing
Improving verb clustering with automatically acquired selectional preferences

EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2
Unsupervised and constrained Dirichlet process mixture models for verb clustering

GEMS '09 Proceedings of the Workshop on Geometrical Models of Natural Language Semantics
Towards Unrestricted, Large-Scale Acquisition of Feature-Based Conceptual Representations from Corpus Data

Research on Language and Computation
Active learning for constrained Dirichlet process mixture models

GEMS '10 Proceedings of the 2010 Workshop on GEometrical Models of Natural Language Semantics
Investigating the cross-linguistic potential of VerbNet-style classification

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
Hierarchical verb clustering using graph factorization

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Evaluating the premises and results of four metaphor identification systems

CICLing'13 Proceedings of the 14th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part I

Abstract

Previous research has shown that syntactic features are the most informative features in automatic verb classification. We investigate their optimal characteristics by comparing a range of feature sets extracted from data where the proportion of verbal arguments and adjuncts is controlled. The data are obtained from different versions of VALEX [1] - a large subcategorization frame (SCF) lexicon for English which was acquired automatically from several corpora and the Web. We evaluate the feature sets thoroughly using four supervised classifiers and one unsupervised method. The best performing feature set includes rich syntactic information about both arguments and adjuncts of verbs. When combined with our best performing classifier (a novel Gaussian classifier), it yields the promising accuracy of 64.2% in classifying 204 verbs into 17 Levin (1993) classes. We discuss the impact of our results on the state of the art and propose avenues for future work.
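As a rough illustration of the general setup the abstract describes (verbs represented by SCF distributions, then assigned to classes by a supervised classifier), the following is a minimal sketch only. It is not the paper's novel Gaussian classifier: it uses a generic diagonal-covariance Gaussian per class, and the frame names, counts, class labels "A" and "B", and the `var_floor` smoothing constant are all invented for the example.

```python
import math
from collections import defaultdict


def scf_distribution(counts, frames):
    """Convert one verb's raw SCF counts into relative frequencies."""
    total = sum(counts.get(f, 0) for f in frames)
    if total == 0:
        return [0.0] * len(frames)
    return [counts.get(f, 0) / total for f in frames]


def fit_gaussians(vectors, labels, var_floor=1e-3):
    """Estimate a mean and a diagonal variance per class.

    var_floor is an assumed smoothing constant that keeps variances positive.
    """
    by_class = defaultdict(list)
    for vec, label in zip(vectors, labels):
        by_class[label].append(vec)
    model = {}
    for label, rows in by_class.items():
        n, dims = len(rows), len(rows[0])
        mean = [sum(r[d] for r in rows) / n for d in range(dims)]
        var = [
            sum((r[d] - mean[d]) ** 2 for r in rows) / n + var_floor
            for d in range(dims)
        ]
        model[label] = (mean, var)
    return model


def log_likelihood(vec, mean, var):
    """Log density of vec under a diagonal Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(vec, mean, var)
    )


def classify(model, vec):
    """Pick the class whose Gaussian gives vec the highest log density."""
    return max(model, key=lambda label: log_likelihood(vec, *model[label]))


# Toy data: hypothetical frame inventory, counts, and class labels.
frames = ["NP", "NP-PP", "INTRANS"]
train = [
    ({"NP": 2, "NP-PP": 8}, "A"),
    ({"NP": 3, "NP-PP": 7}, "A"),
    ({"NP": 1, "NP-PP": 3, "INTRANS": 6}, "B"),
    ({"NP": 2, "INTRANS": 8}, "B"),
]
vectors = [scf_distribution(counts, frames) for counts, _ in train]
labels = [label for _, label in train]
model = fit_gaussians(vectors, labels)

unseen = scf_distribution({"NP": 1, "NP-PP": 9}, frames)
predicted = classify(model, unseen)
```

Accuracy in this setting is simply the fraction of held-out verbs whose predicted class matches the gold-standard class. A richer feature set of the kind the abstract compares would change only the `frames` inventory and the counts fed to `scf_distribution`.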