Cross-lingual random indexing for information retrieval

Authors:
Hans Moen;Erwin Marsi
Affiliations:
Dept. of Computer and Information Science, Norwegian University of Science and Technology (NTNU), Trondheim, Norway;Dept. of Computer and Information Science, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
Venue:
SLSP'13 Proceedings of the First international conference on Statistical Language and Speech Processing
Year:
2013

Abstract

Cross-lingual information retrieval aims to retrieve relevant documents from a collection written in a language different from the query language. A novel method is proposed that avoids direct translation of queries by implicitly encoding translations in a bilingual vector space model (VSM). Both queries and documents are represented as vectors using an extension of random indexing (RI). Because work on RI for information retrieval is limited, it is first evaluated for monolingual retrieval. Two variants are tested: (1) a direct RI model that approximates a standard VSM; (2) an indirect RI model intended to capture latent semantic relations among terms by means of a sliding-window procedure. Next, cross-lingual extensions of these models are presented and evaluated for cross-lingual document retrieval.
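
To illustrate the two monolingual variants named in the abstract, here is a minimal Python sketch of random indexing. It is not the authors' implementation. The dimensionality (DIM), the number of non-zero entries (NONZERO), the window size, the unweighted summation, and all function names are assumptions made for this example. Each term gets a sparse random index vector of +1/-1 entries. In the direct variant, a document vector is the sum of the index vectors of its terms. In the indirect variant, each term first accumulates the index vectors of its neighbours within a sliding window, and a document vector is the sum of those term context vectors. The cross-lingual extension is not covered, since the abstract does not describe its details.

```python
import math
import random
from collections import defaultdict

DIM = 300      # dimensionality of the reduced space (assumed value)
NONZERO = 10   # number of +1/-1 entries per index vector (assumed value)


def index_vector(term, dim=DIM, nonzero=NONZERO):
    """Sparse ternary random vector; seeding by the term makes it reproducible."""
    rng = random.Random(term)
    vec = [0.0] * dim
    for pos in rng.sample(range(dim), nonzero):
        vec[pos] = rng.choice((-1.0, 1.0))
    return vec


def add_into(target, source):
    """Element-wise accumulation of source into target (in place)."""
    for i, value in enumerate(source):
        target[i] += value


def direct_doc_vector(tokens):
    """Direct variant (sketch): sum the index vectors of the document's terms."""
    vec = [0.0] * DIM
    for tok in tokens:
        add_into(vec, index_vector(tok))
    return vec


def train_context_vectors(corpus, window=2):
    """Indirect variant, step 1 (sketch): each term accumulates the index
    vectors of the terms that occur within `window` positions of it."""
    context = defaultdict(lambda: [0.0] * DIM)
    for tokens in corpus:
        for i, tok in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    add_into(context[tok], index_vector(tokens[j]))
    return context


def indirect_doc_vector(tokens, context):
    """Indirect variant, step 2 (sketch): sum the context vectors of the terms."""
    vec = [0.0] * DIM
    for tok in tokens:
        if tok in context:  # membership test does not create a new entry
            add_into(vec, context[tok])
    return vec


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


if __name__ == "__main__":
    # Toy corpus, invented purely for illustration.
    corpus = [
        "the cat sat on the mat".split(),
        "the dog sat on the rug".split(),
    ]
    context = train_context_vectors(corpus, window=2)
    query = "cat mat".split()
    for doc in corpus:
        print(
            " ".join(doc),
            cosine(direct_doc_vector(query), direct_doc_vector(doc)),
            cosine(indirect_doc_vector(query, context), indirect_doc_vector(doc, context)),
        )
```

Retrieval under either variant then amounts to ranking documents by cosine similarity between the query vector and each document vector. Term weighting, stop-word handling, and vector normalisation are left out of the sketch.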