Cross-lingual comparison between distributionally determined word similarity networks

Authors: Olof Görnerup, Jussi Karlgren
Affiliation (both authors): Swedish Institute of Computer Science (SICS), Kista, Sweden
Venue: TextGraphs-5: Proceedings of the 2010 Workshop on Graph-based Methods for Natural Language Processing
Year: 2010

Abstract

As an initial effort to identify universal and language-specific factors that influence the behavior of distributional models, we have formulated a distributionally determined word similarity network model, implemented it for eleven different languages, and compared the resulting networks. In the model, vertices represent words, and two words are linked if they occur in similar contexts. The model is found to capture clear isomorphisms across languages in terms of syntactic and semantic classes, as well as functional categories of abstract discourse markers. Language-specific morphology is found to be a dominant factor for the accuracy of the model.
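
The network construction described in the abstract can be illustrated with a minimal sketch. The abstract does not say how contexts are represented, which similarity measure is used, or how links are selected, so everything below is an assumption made for illustration: contexts are taken to be neighboring words within a fixed window (tagged with their relative position), similarity is taken to be cosine similarity between context count vectors, and two words are linked when that similarity reaches a threshold. The function names, `window`, and `threshold` are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(sentences, window=1):
    """Count, for each word, the (relative position, neighbor) pairs it occurs with.

    Assumption: a "context" is a neighboring word within `window` positions.
    """
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][(j - i, tokens[j])] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_network(sentences, window=1, threshold=0.5):
    """Return {(word_a, word_b): similarity} for word pairs at or above `threshold`.

    Assumption: a fixed similarity threshold decides which words are linked.
    """
    vectors = context_vectors(sentences, window)
    words = sorted(vectors)
    edges = {}
    for idx, a in enumerate(words):
        for b in words[idx + 1:]:
            sim = cosine(vectors[a], vectors[b])
            if sim >= threshold:
                edges[(a, b)] = sim
    return edges


# Toy corpus: "cat"/"dog", "a"/"the", and "eats"/"sleeps" share contexts.
sentences = [
    ["the", "cat", "sleeps"],
    ["the", "dog", "sleeps"],
    ["a", "cat", "eats"],
    ["a", "dog", "eats"],
]
edges = similarity_network(sentences)
```

On this toy corpus, words that fill the same slot end up linked, which loosely mirrors the syntactic and semantic classes the abstract mentions. The pairwise loop is quadratic in vocabulary size and is meant only to make the idea concrete, not to scale to real corpora.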