New experiments in distributional representations of synonymy

Authors:
Dayne Freitag; Matthias Blume; John Byrnes; Edmond Chow; Sadik Kapadia; Richard Rohwer; Zhiqiang Wang
Affiliations:
HNC Software, LLC, San Diego, CA (all authors)
Venue:
CONLL '05 Proceedings of the Ninth Conference on Computational Natural Language Learning
Year:
2005

Abstract

Recent work on detecting synonymy through corpus analysis has used the Test of English as a Foreign Language (TOEFL) as a benchmark. However, this test contains as few as 80 questions, which raises doubts about the statistical significance of reported results. We overcome this limitation by using WordNet to generate a TOEFL-like test that contains thousands of questions and is composed only of words that occur frequently enough in the corpus to support sound distributional comparisons. Experiments with this test lead us to a similarity measure that significantly outperforms the best proposed to date. Our analysis suggests that one strength of this measure is its relative robustness against polysemy.
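
The concern about test size can be illustrated with a rough calculation. The sketch below is not taken from the paper: it assumes a simple normal-approximation confidence interval for a proportion, a hypothetical accuracy of 0.80, and an illustrative larger test of 8000 questions (the abstract says only "thousands"). The function name `accuracy_interval` is likewise made up for this example.

```python
import math


def accuracy_interval(accuracy, n_questions, z=1.96):
    """Normal-approximation confidence interval for a test accuracy.

    z=1.96 corresponds to an approximate 95% interval.
    """
    # Standard error of a proportion estimated from n_questions trials.
    stderr = math.sqrt(accuracy * (1.0 - accuracy) / n_questions)
    return (accuracy - z * stderr, accuracy + z * stderr)


# Hypothetical accuracy of 0.80; both question counts are illustrative.
for n in (80, 8000):
    low, high = accuracy_interval(0.80, n)
    print(f"n={n}: approximate 95% interval [{low:.3f}, {high:.3f}]")
```

Under these assumptions, the interval for 80 questions spans roughly nine percentage points on either side of the measured accuracy, while the interval for 8000 questions spans under one point. Two systems a few points apart would therefore be hard to separate on the smaller test.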