Semi-supervised verb class discovery using noisy features

Authors:
Suzanne Stevenson;Eric Joanis
Affiliations:
University of Toronto;University of Toronto
Venue:
CONLL '03 Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4
Year:
2003

Abstract

We cluster verbs into lexical semantic classes, using a general set of noisy features that capture syntactic and semantic properties of the verbs. The feature set was previously shown to work well in a supervised learning setting, using known English verb classes. In moving to a scenario of verb class discovery, using clustering, we face the problem of having a large number of irrelevant features for a particular clustering task. We investigate various approaches to feature selection, using both unsupervised and semi-supervised methods, comparing the results to subsets of features manually chosen according to linguistic properties. We find that the unsupervised method we tried cannot be consistently applied to our data. However, the semi-supervised approach (using a seed set of sample verbs) overall outperforms not only the full set of features, but the hand-selected features as well.
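
The abstract does not say which selection criterion or clustering algorithm is used, so the following is only a minimal sketch of the general idea of seed-based (semi-supervised) feature selection followed by clustering. It assumes a Fisher-style score (between-class over within-class variance, computed on the seed verbs only) and average-link agglomerative clustering; both are stand-ins chosen for illustration, not the authors' method. The verbs, class labels "A" and "B", feature values, and all function names are invented toy examples.

```python
from statistics import mean

def fisher_scores(seed_vectors, seed_labels):
    """Score each feature by between-class / within-class variance on the seeds."""
    classes = sorted(set(seed_labels))
    scores = []
    for j in range(len(seed_vectors[0])):
        overall = mean(v[j] for v in seed_vectors)
        between = within = 0.0
        for c in classes:
            values = [v[j] for v, lab in zip(seed_vectors, seed_labels) if lab == c]
            mu = mean(values)
            between += len(values) * (mu - overall) ** 2
            within += sum((x - mu) ** 2 for x in values)
        scores.append(between / (within + 1e-9))  # small constant avoids division by zero
    return scores

def select_features(seed_vectors, seed_labels, k):
    """Return the indices of the k highest-scoring features, in ascending order."""
    scores = fisher_scores(seed_vectors, seed_labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

def average_link_clusters(points, n_clusters):
    """Agglomerative clustering with average linkage; returns lists of point indices."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(points[a], points[b])) ** 0.5

    def linkage(c1, c2):
        return mean(dist(a, b) for a in c1 for b in c2)

    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        # Merge the pair of clusters with the smallest average pairwise distance.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Invented toy data: feature 0 separates the two classes, features 1-3 are noise.
verbs = {
    "break": [0.90, 0.10, 0.50, 0.30],
    "melt":  [0.80, 0.90, 0.40, 0.70],
    "open":  [0.85, 0.50, 0.90, 0.10],
    "run":   [0.10, 0.20, 0.45, 0.35],
    "jump":  [0.20, 0.80, 0.50, 0.60],
    "march": [0.15, 0.55, 0.85, 0.15],
}
seeds = {"break": "A", "melt": "A", "run": "B", "jump": "B"}  # hypothetical seed set

# Step 1: pick features using only the labelled seed verbs.
seed_names = list(seeds)
keep = select_features([verbs[v] for v in seed_names],
                       [seeds[v] for v in seed_names], k=1)

# Step 2: cluster every verb (seeds and non-seeds) on the selected features only.
names = list(verbs)
reduced = [[verbs[v][j] for j in keep] for v in names]
found = sorted(sorted(names[i] for i in c) for c in average_link_clusters(reduced, 2))
print(keep, found)
```

In this toy setup the seed labels are used only to rank features; the clustering step itself never sees them, which is what makes the pipeline semi-supervised rather than supervised.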