TextGraphs-1 Proceedings of the First Workshop on Graph Based Methods for Natural Language Processing
As an initial effort to identify universal and language-specific factors that influence the behavior of distributional models, we have formulated a distributionally determined word similarity network model, implemented it for eleven different languages, and compared the resulting networks. In the model, vertices represent words, and two words are linked if they occur in similar contexts. The model is found to capture clear isomorphisms across languages in terms of syntactic and semantic classes, as well as functional categories of abstract discourse markers. Language-specific morphology is found to be a dominating factor for the accuracy of the model.
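The network construction described above (words as vertices, an edge between two words that occur in similar contexts) can be illustrated with a minimal Python sketch. The abstract does not specify how contexts are represented or how similarity is measured, so everything concrete here is an assumption made for illustration: contexts are position-tagged neighbouring words within a fixed window, similarity is the cosine of the context count vectors, and an edge is drawn when the similarity reaches a threshold. The names `context_vectors`, `cosine`, `similarity_network`, `window`, and `threshold` are hypothetical and not taken from the paper.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import sqrt


def context_vectors(sentences, window=1):
    """Count, for each word, its position-tagged neighbours within `window`.

    Assumption: a context feature is a pair (relative offset, neighbouring word).
    """
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][(j - i, tokens[j])] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similarity_network(sentences, window=1, threshold=0.5):
    """Return the set of undirected edges (u, v), with u < v alphabetically.

    Assumption: two words are linked when the cosine similarity of their
    context vectors is at least `threshold`.
    """
    vectors = context_vectors(sentences, window)
    edges = set()
    for u, v in combinations(sorted(vectors), 2):
        if cosine(vectors[u], vectors[v]) >= threshold:
            edges.add((u, v))
    return edges


if __name__ == "__main__":
    toy_corpus = [
        ["the", "cat", "sleeps"],
        ["the", "dog", "sleeps"],
        ["a", "cat", "runs"],
        ["a", "dog", "runs"],
    ]
    for edge in sorted(similarity_network(toy_corpus)):
        print(edge)
```

On this toy corpus the sketch links "cat" with "dog", "a" with "the", and "runs" with "sleeps", since each pair shares identical contexts. The all-pairs comparison is quadratic in vocabulary size, so a realistic corpus would need a frequency cutoff or an approximate nearest-neighbour search.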