Cross-lingual C*ST*RD: English access to Hindi information

  • Authors:
  • Anton Leuski; Chin-Yew Lin; Liang Zhou; Ulrich Germann; Franz Josef Och; Eduard Hovy

  • Affiliations:
  • Information Sciences Institute, University of Southern California (all authors)

  • Venue:
  • ACM Transactions on Asian Language Information Processing (TALIP)
  • Year:
  • 2003

Abstract

We present C*ST*RD, a cross-language information delivery system that supports cross-language information retrieval, information space visualization and navigation, machine translation, and text summarization of single documents and clusters of documents. C*ST*RD was assembled and trained within one month, in the context of DARPA's Surprise Language Exercise, which selected Hindi, a heretofore unstudied language, as the source language. Given the brief time, we could not create deep Hindi capabilities for all the modules; instead, we experimented with combining shallow Hindi capabilities, or even English-only modules, into one integrated system. Various possible configurations, with different tradeoffs in processing speed and ease of use, enable the rapid deployment of C*ST*RD to new languages under various conditions.