This article describes the creation and application of the Turk Bootstrap Word Sense Inventory, a publicly available resource for lexical substitution that covers 397 frequent nouns. The resource was acquired using Amazon Mechanical Turk. In a bootstrapping process with massive collaborative input, substitutions for target words in context are elicited and clustered by sense; then, more contexts are collected. Contexts that cannot be assigned to any sense in the target word's current inventory re-enter the bootstrapping loop and receive their own substitutions. This process yields a sense inventory whose granularity is determined by substitutions rather than by psychologically motivated concepts. The inventory comes with a large number of sense-annotated target word contexts. An evaluation of data quality shows that the process is robust against noise from the crowd, produces a less fine-grained inventory than WordNet, and provides a rich body of high-precision substitution data at low cost. Using the data to train a system for lexical substitution, we show that the amount and quality of the data are sufficient for producing high-quality substitutions automatically. In this system, co-occurrence cluster features are employed as a means to cheaply model topicality.
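The bootstrapping loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how substitutions are clustered, so the greedy overlap rule below is an assumption, and the names `merge_into_inventory`, `bootstrap`, `elicit`, `match`, and `min_overlap` are hypothetical. The two callbacks stand in for crowd tasks (supplying substitutions, and judging whether a context fits an existing sense) and are simulated here with toy data.

```python
def merge_into_inventory(senses, subs, min_overlap=1):
    """Add a set of substitutions to the best-overlapping sense, or open a new sense.

    Assumption: senses are merged greedily by shared substitutions.
    """
    best_idx, best_overlap = None, 0
    for idx, sense in enumerate(senses):
        overlap = len(sense["subs"] & subs)
        if overlap > best_overlap:
            best_idx, best_overlap = idx, overlap
    if best_idx is not None and best_overlap >= min_overlap:
        senses[best_idx]["subs"] |= subs
        return best_idx
    senses.append({"subs": set(subs), "contexts": []})
    return len(senses) - 1


def bootstrap(contexts, elicit, match, min_overlap=1):
    """Grow a sense inventory from contexts of one target word.

    elicit(ctx)        -> substitutions for the target word in ctx (crowd task, hypothetical)
    match(ctx, senses) -> index of a fitting sense, or None (crowd task, hypothetical)
    """
    senses = []
    for ctx in contexts:
        idx = match(ctx, senses)
        if idx is None:
            # Unassignable context: it re-enters the loop and gets substitutions.
            idx = merge_into_inventory(senses, set(elicit(ctx)), min_overlap)
        senses[idx]["contexts"].append(ctx)
    return senses


# Toy simulation for the target word "bank" (invented data, not from the resource).
TOY_SUBS = {
    "deposit money at the bank": ["lender", "institution"],
    "sat on the river bank": ["shore", "riverside"],
    "the bank raised rates": ["lender", "creditor"],
}

def toy_elicit(ctx):
    return TOY_SUBS[ctx]

def toy_match(ctx, senses):
    # Simulated judgment: a context fits a sense if it mentions one of its substitutions.
    words = set(ctx.split())
    for idx, sense in enumerate(senses):
        if words & sense["subs"]:
            return idx
    return None

contexts = list(TOY_SUBS) + ["the shore or bank was muddy"]
senses = bootstrap(contexts, toy_elicit, toy_match)
```

In this toy run the last context is assigned to an existing sense without any new elicitation, which mirrors the idea that only contexts outside the current inventory need a fresh supply of substitutions.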