Comparing Window and Syntax Based Strategies for Semantic Extraction

Authors: Pablo Gamallo Otero
Affiliations: Departamento de Língua Espanhola, Faculdade de Filologia, Universidade de Santiago de Compostela, Galiza, Spain
Venue: PROPOR '08 Proceedings of the 8th international conference on Computational Processing of the Portuguese Language
Year: 2008

Abstract

In this paper, we describe and compare two approaches to extracting similar words from large corpora. In particular, we compare a method based on syntactic contexts with two strategies that rely on windows of tagged words: one that takes word order into account and one that treats the window as a bag of words. On a Portuguese corpus of 12 million words, syntactic contexts produce significantly better results for both frequent and less frequent words.
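
To make the three kinds of context concrete, the following is a minimal sketch and not the paper's implementation. The function names `window_contexts` and `syntactic_contexts`, the window size, the tag labels, and the dependency triple format are all assumptions chosen for illustration; the abstract does not specify them.

```python
from collections import Counter


def window_contexts(tokens, index, size=2, keep_order=False):
    """Count the context features of tokens[index] within +/- size positions.

    tokens is a list of (word, tag) pairs. With keep_order=False the window
    is a bag of words; with keep_order=True each feature also records the
    relative offset, so word order is preserved.
    """
    features = Counter()
    start = max(0, index - size)
    end = min(len(tokens), index + size + 1)
    for pos in range(start, end):
        if pos == index:
            continue  # the target word is not its own context
        if keep_order:
            features[(pos - index, tokens[pos])] += 1
        else:
            features[tokens[pos]] += 1
    return features


def syntactic_contexts(dependencies, word):
    """Count the syntactic contexts of word.

    dependencies is a list of hypothetical (head, relation, dependent)
    triples, as a dependency parser might produce them.
    """
    features = Counter()
    for head, relation, dependent in dependencies:
        if head == word:
            features[(relation, "down", dependent)] += 1
        if dependent == word:
            features[(relation, "up", head)] += 1
    return features


# Toy tagged sentence: "o gato come o peixe" ("the cat eats the fish").
tokens = [("o", "DET"), ("gato", "N"), ("come", "V"), ("o", "DET"), ("peixe", "N")]
bag = window_contexts(tokens, 2, size=2)
ordered = window_contexts(tokens, 2, size=2, keep_order=True)

# Assumed parser output for the same sentence.
deps = [("come", "subj", "gato"), ("come", "obj", "peixe")]
syn = syntactic_contexts(deps, "come")
```

In this toy case the bag-of-words window merges the two occurrences of the determiner into one feature, the ordered window keeps them apart by offset, and the syntactic contexts keep only the words that are grammatically related to the target.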