Exploiting aligned parallel corpora in multilingual studies and applications

  • Authors: Dan Tufis

  • Affiliations: Research Institute for Artificial Intelligence, Romanian Academy, Bucharest, Romania

  • Venue: IWIC'07 Proceedings of the 1st international conference on Intercultural collaboration
  • Year: 2007

Abstract

Parallel corpora encode extremely valuable linguistic knowledge, and recent advances in multilingual corpus linguistics make this knowledge easier to uncover. The linguistic decisions that human translators make in order to faithfully convey the meaning of the source text can be traced and used as evidence of linguistic facts which, in a monolingual context, might be unavailable to (or overlooked by) a computer program. Multilingual technologies, which are to a large extent language independent, provide powerful support for systematic and consistent cross-lingual studies and make it easier to build annotated linguistic resources for languages where such resources are scarce or missing. In this paper we briefly present some of the underlying multilingual technologies and methodologies we developed for exploiting parallel corpora, and we discuss their relevance for cross-linguistic studies and applications.