ATLAS: a new text alignment architecture

  • Authors:
  • Bettina Schrader

  • Affiliations:
  • University of Osnabrück, Osnabrück

  • Venue:
  • COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

We are presenting a new, hybrid alignment architecture for aligning bilingual, linguistically annotated parallel corpora. It is able to align simultaneously at paragraph, sentence, phrase and word level, using statistical and heuristic cues, along with linguistics-based rules. The system currently aligns English and German texts, and the linguistic annotation used covers POS-tags, lemmas and syntactic constitutents. However, as the system is highly modular, we can easily adapt it to new language pairs and other types of annotation. The hybrid nature of the system allows experiments with a variety of alignment cues to find solutions to word alignment problems like the correct alignment of rare words and multiwords, or how to align despite syntactic differences between two languages. First performance tests are promising, and we are setting up a gold standard for a thorough evaluation of the system.