Evaluation of alignment methods for HTML parallel text

  • Authors:
  • Enrique Sánchez-Villamil; Susana Santos-Antón; Sergio Ortiz-Rojas; Mikel L. Forcada

  • Affiliations:
  • Transducens group, Departament de Llenguatges i Sistemes Informàtics, Universitat d’Alacant, Alacant, Spain (all authors)

  • Venue:
  • FinTAL'06 Proceedings of the 5th international conference on Advances in Natural Language Processing
  • Year:
  • 2006

Abstract

The Internet constitutes a potentially huge store of parallel text that can be collected and exploited by applications such as multilingual information retrieval and machine translation. These applications usually require bilingual text aligned at least at the sentence level. This paper presents new aligners designed to improve the performance of classical sentence-level aligners when aligning structured text such as HTML. The new aligners are compared with other well-known geometric aligners.