A comparable corpus based on aligned multilingual ontologies

  • Authors:
  • Roger Granada;Lucelene Lopes;Carlos Ramisch;Cassia Trojahn;Renata Vieira;Aline Villavicencio

  • Affiliations:
  • PUCRS (Brazil);PUCRS (Brazil);University of Grenoble (France);University of Grenoble (France);PUCRS (Brazil);UFRGS (Brazil)

  • Venue:
  • MM '12 Proceedings of the First Workshop on Multilingual Modeling
  • Year:
  • 2012

Abstract

In this paper we present a methodology for building comparable corpora using multilingual ontologies of a specific domain. This resource can be exploited to foster research on multilingual corpus-based ontology learning, population and matching. The resource-building process is exemplified by the construction of annotated comparable corpora in English, Portuguese, and French. The corpora, from the conference organization domain, are built using the multilingual ontology concept labels as seeds for crawling relevant documents from the web through a search engine. Using ontologies allows better coverage of the domain. The main goal of this paper is to describe the design methodology followed by the creation of the corpora. We present a preliminary evaluation and discuss the corpora's characteristics and potential applications.
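
As an illustration of the seeding idea only, the following minimal sketch shows how aligned concept labels could be turned into per-language search queries. It is not the authors' implementation: the concept names, the labels in `ALIGNED_LABELS`, the `seed_queries` helper, and the choice to combine labels in pairs are all assumptions made for the example, and the step that submits queries to a search engine is left out.

```python
from itertools import combinations

# Hypothetical aligned concept labels for the conference organization domain.
# These entries are illustrative and are not taken from the paper's ontologies.
ALIGNED_LABELS = {
    "Paper": {"en": "paper", "pt": "artigo", "fr": "article"},
    "Reviewer": {"en": "reviewer", "pt": "revisor", "fr": "relecteur"},
    "ProgramCommittee": {
        "en": "program committee",
        "pt": "comitê de programa",
        "fr": "comité de programme",
    },
}


def seed_queries(aligned_labels, lang, size=2):
    """Build query strings for one language by combining concept labels.

    Each query is a combination of `size` quoted labels (an assumed strategy),
    so that retrieved documents mention several domain concepts at once.
    """
    labels = sorted(
        entry[lang] for entry in aligned_labels.values() if lang in entry
    )
    return [
        " ".join(f'"{label}"' for label in combo)
        for combo in combinations(labels, size)
    ]


# One list of seed queries per language; the same concepts drive every language,
# which is what keeps the resulting corpora comparable.
queries_by_lang = {
    lang: seed_queries(ALIGNED_LABELS, lang) for lang in ("en", "pt", "fr")
}
```

Because every language draws its queries from the same aligned concepts, the documents collected for English, Portuguese, and French would be expected to cover the same topics without being translations of one another.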