TIPS: a translingual information processing system

  • Authors:
  • Y. Al-Onaizan; R. Florian; M. Franz; H. Hassan; Y. S. Lee; S. McCarley; K. Papineni; S. Roukos; J. Sorensen; C. Tillmann; T. Ward; F. Xia

  • Affiliations:
  • IBM T. J. Watson Research Center, Yorktown Heights (all authors)

  • Venue:
  • NAACL-Demonstrations '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology: Demonstrations - Volume 4
  • Year:
  • 2003

Abstract

Searching online information is increasingly a daily activity for many people. Online content is also becoming more multilingual (e.g., the proportion of English web users, which has been decreasing as a fraction of the growing population of web users, dipped below 50% in the summer of 2001). To improve the ability of an English speaker to search multilingual content, we built a system that supports cross-lingual search of an Arabic newswire collection and provides on-demand translation of Arabic web pages into English. The cross-lingual search engine supports fast search (sub-second response for typical queries) and achieves state-of-the-art performance in the high-precision region of the result list. The on-demand statistical machine translation (SMT) uses the Direct Translation model along with a novel statistical Arabic Morphological Analyzer to yield state-of-the-art translation quality. It uses an efficient dynamic programming decoder that achieves reasonable speed for translating web documents.