Compression of Multilingual Aligned Texts

  • Authors:
  • Ehud S. Conley; Shmuel T. Klein

  • Affiliations:
  • Bar-Ilan University, Israel; Bar-Ilan University, Israel

  • Venue:
  • DCC '06 Proceedings of the Data Compression Conference
  • Year:
  • 2006

Abstract

In countries like Canada, Belgium and Switzerland, where speakers of two or more languages live side by side, all official texts have to be published in multilingual form. Similarly, all official texts of the European Union are translated into the languages of all member states. As a result, there is a growing corpus of important texts, large parts of which are highly redundant: they carry no information content of their own and are merely transformed copies of other parts of the text collection.