Ranking multilingual documents using minimal language dependent resources

Authors:
G. S. K. Santosh;N. Kiran Kumar;Vasudeva Varma
Affiliations:
International Institute of Information Technology, Hyderabad, India;International Institute of Information Technology, Hyderabad, India;International Institute of Information Technology, Hyderabad, India
Venue:
CICLing'11 Proceedings of the 12th international conference on Computational linguistics and intelligent text processing - Volume Part II
Year:
2011

Abstract

This paper proposes an approach for extracting simple and effective features that enhance multilingual document ranking (MLDR). Little prior research has used multilingual document similarity to determine the ranking of documents, and the existing work relies heavily on language-specific tools, which makes it hard to reimplement for other languages. Our approach extracts various multilingual and monolingual similarity features using only a basic language resource, a bilingual dictionary. Because no language-specific tools are used, the approach is extensible to other languages. We used the datasets provided by the Forum for Information Retrieval Evaluation (FIRE) for its 2010 Adhoc Cross-Lingual document retrieval task on Indian languages. We performed experiments with different ranking algorithms and compared their results. The results demonstrate the effectiveness of the proposed features in enhancing multilingual document ranking.
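
The abstract does not spell out the individual features, so the following is only an illustrative sketch of what a dictionary-based multilingual similarity feature could look like, not the authors' method. It assumes documents are already tokenized, that the bilingual dictionary is a plain mapping from a source-language term to a list of target-language translations, and that cosine similarity over term counts is an acceptable stand-in for the similarity measure. The function names and the toy dictionary are hypothetical.

```python
from collections import Counter
from math import sqrt


def translate_terms(tokens, bilingual_dict):
    """Replace each source-language token with its dictionary translations.

    Tokens without a dictionary entry are dropped (an assumption made
    for this sketch; other fallbacks are possible).
    """
    translated = []
    for token in tokens:
        translated.extend(bilingual_dict.get(token, []))
    return translated


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two bags of words built from raw term counts."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = sqrt(sum(v * v for v in counts_a.values()))
    norm_b = sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cross_lingual_similarity(src_tokens, tgt_tokens, bilingual_dict):
    """Translate the source document term by term, then compare it to the target."""
    return cosine_similarity(translate_terms(src_tokens, bilingual_dict), tgt_tokens)


# Toy, hypothetical dictionary: romanized source terms mapped to English.
toy_dict = {"pani": ["water"], "nadi": ["river"]}
score = cross_lingual_similarity(["pani", "nadi"], ["water", "river"], toy_dict)
```

A score like this could serve as one input feature to a ranking algorithm alongside monolingual similarity scores. Since it needs only a dictionary lookup, it does not depend on stemmers, taggers, or other language-specific tools.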