Interactive Cross-Language Document Selection

  • Authors:
  • Douglas W. Oard; Julio Gonzalo; Mark Sanderson; Fernando López-Ostenero; Jianqiang Wang

  • Affiliations:
  • Human-Computer Interaction Laboratory, College of Information Studies and Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, USA. oard@umd.edu, julio@lsi.u ...; Department of Information Studies, University of Sheffield, Western Bank, Sheffield, S10 2TN, UK. m.sanderson@sheffield.ac.uk; Departamento de Lenguajes y Sistemas Informáticos, Universidad Nacional de Educación a Distancia, E.T.S.I Industriales, Ciudad Universitaria s/n, 28040 Madrid, Spain. flopez@lsi. ...; College of Information Studies, University of Maryland, College Park, MD 20742, USA. wangjq@glue.umd.edu

  • Venue:
  • Information Retrieval
  • Year:
  • 2004

Abstract

The problem of finding documents written in a language that the searcher cannot read is perhaps the most challenging application of cross-language information retrieval technology. In interactive applications, that task involves at least two steps: (1) the machine locates promising documents in a collection that is larger than the searcher could scan, and (2) the searcher recognizes documents relevant to their intended use from among those nominated by the machine. This article presents the results of experiments designed to explore three techniques for supporting interactive relevance assessment: (1) full machine translation, (2) rapid term-by-term translation, and (3) focused phrase translation. Machine translation was found to better support this task than term-by-term translation, and focused phrase translation further improved recall without an adverse effect on precision. The article concludes with an assessment of the strengths and weaknesses of the evaluation framework used in this study and some remarks on implications of these results for future evaluation campaigns.