Comparing Different Properties Involved in Word Similarity Extraction

Authors:
Pablo Gamallo Otero
Affiliations:
Universidade de Santiago de Compostela, Galiza, Spain
Venue:
EPIA '09 Proceedings of the 14th Portuguese Conference on Artificial Intelligence: Progress in Artificial Intelligence
Year:
2009

Citing 13
Cited 1

Explorations in Automatic Thesaurus Discovery

Explorations in Automatic Thesaurus Discovery
Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL

EMCL '01 Proceedings of the 12th European Conference on Machine Learning
Automatic retrieval and clustering of similar words

COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 2
Extracting word correspondences from bilingual corpora based on word co-occurrences information

COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 1
Automatic identification of word translations from unrelated English and German corpora

ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
Improvements in automatic thesaurus extraction

ULA '02 Proceedings of the ACL-02 workshop on Unsupervised lexical acquisition - Volume 9
Clustering Syntactic Positions with Similar Semantic Requirements

Computational Linguistics
Navigation in degree of interest trees

Proceedings of the working conference on Advanced visual interfaces
Scaling distributional similarity to large corpora

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Accurate collocation extraction using a multilingual parser

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Dependency-Based Construction of Semantic Space Models

Computational Linguistics
An efficient algorithm for building a distributional thesaurus (and other Sketch Engine developments)

ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions
A comparison of co-occurrence and similarity measures as simulations of context

CICLing'08 Proceedings of the 9th international conference on Computational linguistics and intelligent text processing

Is singular value decomposition useful for word similarity extraction?

Language Resources and Evaluation

Quantified Score

Hi-index	0.00

Visualization

Abstract

In this paper, we will analyze the behavior of several parameters, namely type of contexts, similarity measures, and word space models, in the task of word similarity extraction from large corpora. The main objective of the paper will be to describe experiments comparing different extraction systems based on all possible combinations of these parameters. Special attention will be paid to the comparison between syntax-based contexts and windowing techniques, binary similarity metrics and more elaborate coefficients, as well as baseline word space models and Singular Value Decomposition strategies. The evaluation leads us to conclude that the combination of syntax-based contexts, binary similarity metrics, and a baseline word space model makes the extraction much more precise than other combinations with more elaborate metrics and complex models.