Building a Spanish-Portuguese parallel corpus for statistical machine translation

  • Authors:
  • Wilker F. Aziz; Thiago A. S. Pardo; Ivandré Paraboni

  • Affiliations:
  • Universidade de São Paulo, São Carlos, Brazil; Universidade de São Paulo, São Carlos, Brazil; Universidade de São Paulo, São Paulo, Brazil

  • Venue:
  • Companion Proceedings of the XIV Brazilian Symposium on Multimedia and the Web
  • Year:
  • 2008

Abstract

Parallel corpora have long been recognised as valuable resources for building machine translation (MT) applications, but their usefulness has often been limited to translation between language pairs that include English. In this work we describe our efforts to build a parallel corpus for Brazilian Portuguese and European Spanish. The corpus has been aligned at the sentence and word levels and manually inspected for correctness, representing a first step towards the development of translation models for this language pair.