An Empirical Bayesian Method for Detecting Out of Context Words

Authors:
Sanaz Jabbari;Ben Allison;Louise Guthrie
Affiliations:
Natural Language Processing Group, Department of Computer Science, University of Sheffield, UK;Natural Language Processing Group, Department of Computer Science, University of Sheffield, UK;Natural Language Processing Group, Department of Computer Science, University of Sheffield, UK
Venue:
TSD '08 Proceedings of the 11th international conference on Text, Speech and Dialogue
Year:
2008

Abstract

In this paper, we propose an empirical Bayesian method for determining whether a word is used out of context. We treat a word's context as a multinomially distributed random variable, which leads to a simple and direct Bayesian hypothesis test for the problem. We show that this method outperforms a method based on common practice in the literature. We also show that an empirical Bayes approach, in which the behaviour of other words is used to specify a prior distribution on the model parameters, improves performance appreciably where training data is sparse.
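
The abstract does not give the form of the test, so the following is only an illustrative sketch of how a multinomial context model with an empirical Bayes prior could yield a Bayesian hypothesis test. It assumes a Dirichlet prior (the conjugate choice for a multinomial), a comparison between "context drawn from the target word's distribution" and "context drawn from a background distribution", and a prior built by pooling the context counts of other words. The function names, the `strength` and `floor` parameters, and the two hypotheses are all assumptions made for illustration, not details taken from the paper.

```python
from math import lgamma


def log_dirichlet_multinomial(x, alpha):
    """Log marginal probability of context counts x when the multinomial
    parameters have a Dirichlet(alpha) distribution.

    The multinomial coefficient is omitted: it depends only on x and
    cancels when two hypotheses are compared on the same observation.
    """
    n = sum(x)
    a = sum(alpha)
    total = lgamma(a) - lgamma(a + n)
    for xk, ak in zip(x, alpha):
        total += lgamma(ak + xk) - lgamma(ak)
    return total


def empirical_prior(other_word_counts, strength=1.0, floor=1e-3):
    """Hypothetical empirical Bayes step: pool the context counts of other
    words and rescale them into Dirichlet pseudo-counts.

    `strength` scales the prior's weight; `floor` keeps every pseudo-count
    strictly positive. Both are illustrative assumptions.
    """
    size = len(other_word_counts[0])
    pooled = [sum(row[k] for row in other_word_counts) for k in range(size)]
    total = sum(pooled)
    return [strength * size * pooled[k] / total + floor for k in range(size)]


def log_bayes_factor(observed, word_counts, background_counts, prior):
    """Log Bayes factor of 'in context' against 'out of context'.

    Each hypothesis uses the Dirichlet posterior obtained by adding its
    training counts to the shared prior. Positive values favour the
    observed context being typical of the target word.
    """
    in_context = [p + c for p, c in zip(prior, word_counts)]
    out_of_context = [p + c for p, c in zip(prior, background_counts)]
    return (log_dirichlet_multinomial(observed, in_context)
            - log_dirichlet_multinomial(observed, out_of_context))


# Toy data over four context features (purely illustrative numbers).
prior = empirical_prior([[3, 1, 2, 2], [1, 3, 2, 2]])
word_counts = [8, 6, 0, 0]        # sparse training contexts for the target word
background_counts = [5, 5, 5, 5]  # contexts pooled over the whole corpus
typical = log_bayes_factor([2, 1, 0, 0], word_counts, background_counts, prior)
atypical = log_bayes_factor([0, 0, 2, 1], word_counts, background_counts, prior)
```

With these toy counts, a context that resembles the target word's training contexts gives a positive log Bayes factor and a dissimilar one gives a negative value. When the target word has little training data, the pooled prior dominates the posterior, which is one way the behaviour of other words could help in the sparse-data setting the abstract mentions.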