Ensemble approach for cross language information retrieval

  • Authors:
  • Dinesh Mavaluru;R. Shriram;W. Aisha Banu

  • Affiliations:
  • School of Computer and Information Sciences, B.S. Abdur Rahman University, Chennai, India;School of Computer and Information Sciences, B.S. Abdur Rahman University, Chennai, India;School of Computer and Information Sciences, B.S. Abdur Rahman University, Chennai, India

  • Venue:
  • CICLing'12 Proceedings of the 13th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part II
  • Year:
  • 2012

Abstract

Cross language information retrieval (CLIR) is a subfield of information retrieval (IR) that deals with retrieving content in one language (the source language) for a search query expressed in another language (the target language) on the Web. CLIR evolved as a field because the majority of the content on the Web is in English. Hence there is a need for dynamic translation of web content for a query expressed in the user's native language. The biggest problem is the ambiguity of the query expressed in the native language. Such ambiguity is typically not a problem for human beings, who can infer the appropriate word sense or meaning from context, but search engines usually cannot overcome this limitation. Hence, methods and mechanisms that provide native-language access to information on the Web are needed. There is a need not only to retrieve the relevant results but also to present the content behind the results in a manner the user can understand. Research in the domain has so far focused on techniques that make use of support vector machines, the suffix tree approach, Boolean models, and iterative results clustering. This research work focuses on a methodology for personalized, context-based cross language information retrieval using an ensemble-learning approach. The source language for this research is English and the target language is Telugu. The methodology has been tested on various queries, and the results are shown in this work.