This paper improves the use of pseudo-words as an evaluation framework for selectional preferences. Pseudo-words were originally used to evaluate word sense disambiguation, but they are now commonly used to evaluate selectional preferences. A selectional preference model ranks a set of possible arguments for a verb by their semantic fit to the verb. Pseudo-words serve as a proxy evaluation for these decisions. The evaluation takes an argument of a verb like drive (e.g., car), pairs it with an alternative word (e.g., car/rock), and asks a model to identify the original. This paper studies two main aspects of pseudo-word creation that affect performance results. (1) Pseudo-word evaluations often evaluate only a subset of the words. We show that selectional preferences should instead be evaluated on the data in its entirety. (2) Different approaches to selecting partner words can produce overly optimistic evaluations. We offer suggestions to address these factors and present a simple baseline that outperforms the state of the art by 13% absolute on a newspaper domain.
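The pseudo-word protocol described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names (`make_pseudo_words`, `count_score`, `evaluate`), the uniform random choice of partner word, the raw co-occurrence count used as a scorer, and the half-credit rule for ties are all assumptions made for illustration only.

```python
import random
from collections import Counter


def make_pseudo_words(pairs, vocabulary, seed=0):
    """Pair each (verb, argument) with a partner word (confounder).

    Assumption: the partner is drawn uniformly at random from the
    vocabulary. The abstract notes that the selection strategy affects
    results, so this is only one hypothetical choice.
    """
    rng = random.Random(seed)
    items = []
    for verb, arg in pairs:
        candidates = [w for w in vocabulary if w != arg]
        items.append((verb, arg, rng.choice(candidates)))
    return items


def count_score(train_pairs):
    """Hypothetical scorer: raw count of (verb, argument) in training data."""
    counts = Counter(train_pairs)

    def score(verb, arg):
        return counts[(verb, arg)]

    return score


def evaluate(items, score):
    """Fraction of items where the original argument outscores the partner.

    Ties receive half credit (an assumption of this sketch).
    """
    correct = 0.0
    for verb, original, confounder in items:
        s_orig = score(verb, original)
        s_conf = score(verb, confounder)
        if s_orig > s_conf:
            correct += 1.0
        elif s_orig == s_conf:
            correct += 0.5
    return correct / len(items)


if __name__ == "__main__":
    # Toy data, invented for this example.
    train = [("drive", "car"), ("drive", "truck"), ("eat", "apple")]
    test = [("drive", "car"), ("eat", "apple")]
    vocab = ["car", "rock", "apple", "truck"]

    # Every test pair is kept, rather than a filtered subset.
    items = make_pseudo_words(test, vocab)
    print(evaluate(items, count_score(train)))
```

Because every test pair becomes an item, this sketch evaluates on the data in its entirety, in line with point (1). Swapping the uniform choice in `make_pseudo_words` for another selection strategy is one way to explore the effect described in point (2).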