Bridging the gap between technology and users: leveraging machine translation in a visual data triage tool

  • Authors:
  • Thomas Hoeft;Nick Cramer;M. L. Gregory;Elizabeth Hetzler

  • Affiliations:
  • Pacific Northwest National Laboratory, Richland, WA;Pacific Northwest National Laboratory, Richland, WA;Pacific Northwest National Laboratory, Richland, WA;Pacific Northwest National Laboratory, Richland, WA

  • Venue:
  • HLT-Demo '05 Proceedings of HLT/EMNLP on Interactive Demonstrations
  • Year:
  • 2005

Abstract

Although machine translation (MT) is one of the oldest pursuits in computational linguistics (see Bar-Hillel, 1951), it remains an unsolved problem. Research has progressed a great deal, but technology transfer to end users is limited. In this demo, we present a visualization tool for manipulating foreign language data. Building on IN-SPIRE (Hetzler & Turner 2004), software developed for the exploration and understanding of large amounts of text data, we have developed a novel approach to mining and triaging large amounts of foreign language text. By clustering documents in their native language and using translations only in the data triage phase, our system avoids the major pitfalls that plague modern machine translation. More generally, the visualization environment we have developed allows users to take advantage of current NLP technologies, including MT. We will demonstrate the use of this tool to triage a corpus of foreign text.
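
The idea of clustering in the native language and translating only during triage can be illustrated with a minimal sketch. This is not IN-SPIRE's algorithm, which the abstract does not describe: the character-trigram similarity, the greedy single-pass clustering, the threshold value, and the `translate` callback are all assumptions chosen for illustration. The point is only that similarity is computed on untranslated text, and that the (hypothetical) MT call is made once per cluster rather than once per document.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Bag of character n-grams; needs no translation or language-specific tokenizer."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(docs, threshold=0.3):
    """Greedy single-pass clustering on the original-language text.

    Each document joins the first cluster whose seed document is at least
    `threshold` similar; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of document indices.
    """
    vectors = [char_ngrams(d) for d in docs]
    clusters = []
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def triage(docs, clusters, translate):
    """Translate only the seed document of each cluster for the analyst.

    `translate` is a hypothetical callback standing in for any MT system.
    """
    return [(members, translate(docs[members[0]])) for members in clusters]
```

In this sketch, a corpus of N documents that falls into k clusters triggers k translation calls instead of N, so translation errors can affect what the analyst reads during triage but not how the documents are grouped.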