Evaluation of the Bible as a resource for cross-language information retrieval

  • Authors:
  • Peter A. Chew; Steve J. Verzi; Travis L. Bauer; Jonathan T. McClain

  • Affiliations:
  • Sandia National Laboratories, Albuquerque, NM; Sandia National Laboratories, Albuquerque, NM; Sandia National Laboratories, Albuquerque, NM; Sandia National Laboratories, Albuquerque, NM

  • Venue:
  • MLRI '06 Proceedings of the Workshop on Multilingual Language Resources and Interoperability
  • Year:
  • 2006

Abstract

An area of recent interest in cross-language information retrieval (CLIR) is the question of which parallel corpora might be best suited to tasks in CLIR, or even to what extent parallel corpora can be obtained or are necessary. One proposal, which in our opinion has been somewhat overlooked, is that the Bible holds a unique value as a multilingual corpus, being (among other things) widely available in a broad range of languages and having a high coverage of modern-day vocabulary. In this paper, we test empirically whether this claim is justified through a series of validation tests on various information retrieval tasks. Our results appear to indicate that our methodology may significantly outperform others recently proposed.