Large-scale learning of word relatedness with constraints

Authors:
Guy Halawi;Gideon Dror;Evgeniy Gabrilovich;Yehuda Koren
Affiliations:
Tel Aviv University, Tel Aviv, Israel;Yahoo! Research, Haifa, Israel;Yahoo! Research, Santa Clara, CA, USA;Yahoo! Research, Haifa, Israel
Venue:
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2012


Abstract

Prior work on computing semantic relatedness of words focused on representing their meaning in isolation, effectively disregarding inter-word affinities. We propose a large-scale data mining approach to learning word-word relatedness, where known pairs of related words impose constraints on the learning process. We learn for each word a low-dimensional representation, which strives to maximize the likelihood of a word given the contexts in which it appears. Our method, called CLEAR, is shown to significantly outperform previously published approaches. The proposed method is based on first principles, and is generic enough to exploit diverse types of text corpora, while having the flexibility to impose constraints on the derived word similarities. We also make publicly available a new labeled dataset for evaluating word relatedness algorithms, which we believe to be the largest such dataset to date.
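The abstract does not give the CLEAR objective, so the following is only a toy sketch of the general idea it describes: learn a low-dimensional vector per word by maximizing the likelihood of a word given a context, while known related pairs constrain the result. Everything here is an assumption made for illustration: the function name `train_embeddings`, the softmax likelihood, the squared-distance penalty used as the constraint, and all hyperparameters. It should not be read as the authors' formulation.

```python
import numpy as np

def train_embeddings(pairs, related, vocab_size, dim=8, lam=0.5,
                     lr=0.1, epochs=200, seed=0):
    """Toy constrained embedding learner (hypothetical, not the CLEAR method).

    pairs:   list of (word_id, context_id) co-occurrences
    related: list of (word_id, word_id) pairs known to be related
    Minimizes (NLL + lam * sum ||W[a] - W[b]||^2) / N by batch gradient
    descent, where N = max(len(pairs), 1).
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(vocab_size, dim))  # word vectors
    C = rng.normal(scale=0.1, size=(vocab_size, dim))  # context vectors
    n = max(len(pairs), 1)
    for _ in range(epochs):
        gW = np.zeros_like(W)
        gC = np.zeros_like(C)
        # Likelihood term: softmax P(word | context) over the vocabulary.
        for w, c in pairs:
            scores = W @ C[c]
            p = np.exp(scores - scores.max())
            p /= p.sum()
            err = p.copy()
            err[w] -= 1.0            # gradient of -log p[w] w.r.t. scores
            gW += np.outer(err, C[c])
            gC[c] += err @ W
        # Constraint term: pull vectors of known related words together.
        for a, b in related:
            diff = W[a] - W[b]
            gW[a] += 2.0 * lam * diff
            gW[b] -= 2.0 * lam * diff
        W -= lr * gW / n
        C -= lr * gC / n
    return W

def cosine(u, v):
    """Cosine similarity, one common way to score relatedness of two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
```

A call such as `train_embeddings([(0, 1), (1, 0), (2, 3)], related=[(0, 1)], vocab_size=4)` returns a `(4, 8)` array, and `cosine(W[0], W[1])` then gives a relatedness score for words 0 and 1. The soft penalty is just one way to express a constraint; a hard constraint or a ranking loss would be equally plausible readings of the abstract.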