Multilingual lexical database generation from parallel texts in 20 European languages with endogenous resources

  • Authors:
  • Emmanuel Giguet; Pierre-Sylvain Luquet

  • Affiliations:
  • Université de Caen, Caen Cedex, France; Université de Caen, Caen Cedex, France

  • Venue:
  • COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
  • Year:
  • 2006

Abstract

This paper addresses the generation of multilingual lexical databases from parallel corpora. The aim is to help enrich lexical databases for languages with few linguistic resources. Our approach is endogenous: it relies on the raw texts only and requires no external linguistic resources such as stemmers or taggers. The system produces alignments for the 20 European languages of the 'Acquis Communautaire' corpus.
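
To give a rough sense of what aligning words from raw parallel text alone can look like, the sketch below scores candidate word pairs by co-occurrence across sentence-aligned text using the Dice coefficient. This is an illustrative stand-in, not the authors' algorithm, which the abstract does not describe. The function name `dice_lexicon`, the `min_count` threshold, and the toy English-French sentence pairs are all assumptions made for the example.

```python
from collections import Counter
from itertools import product


def dice_lexicon(pairs, min_count=2):
    """Map each source word to its best target word by Dice coefficient.

    `pairs` is a list of (source_sentence, target_sentence) strings that
    are assumed to be sentence-aligned already (hypothetical input format).
    """
    src_freq, tgt_freq, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        # Whitespace tokenisation only: no stemmer, tagger, or dictionary.
        s_words = set(src.lower().split())
        t_words = set(tgt.lower().split())
        src_freq.update(s_words)
        tgt_freq.update(t_words)
        # Count every source/target word pair seen in the same sentence pair.
        joint.update(product(s_words, t_words))

    best = {}
    for (s, t), c in joint.items():
        if c < min_count:
            continue  # ignore pairs seen together too rarely
        score = 2 * c / (src_freq[s] + tgt_freq[t])
        if s not in best or score > best[s][1]:
            best[s] = (t, score)
    return best


# Toy sentence pairs, invented for illustration (not taken from the corpus).
pairs = [
    ("council regulation", "règlement conseil"),
    ("council decision", "décision conseil"),
    ("commission regulation", "règlement commission"),
    ("commission decision", "décision commission"),
]

lexicon = dice_lexicon(pairs)
for word, (translation, score) in sorted(lexicon.items()):
    print(word, translation, score)
```

The only inputs are the texts themselves, which is the sense in which such a method is endogenous. A realistic system would also need to handle inflection, multiword units, and sentence alignment, none of which this sketch attempts.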