DISPARA, a System for Distributing Parallel Corpora on the Web

  • Authors:
  • Diana Santos

  • Venue:
  • PorTAL '02 Proceedings of the Third International Conference on Advances in Natural Language Processing
  • Year:
  • 2002

Abstract

The main purpose of the present paper is to document the process of creating a parallel corpus available on the Web, thereby illuminating technical and design issues involved in such a project. By this we hope to gather more researchers to help with the building process, as well as boost considerably the number of users of the parallel corpus. We start by noting that resource creation is far from a trivial process, and proceed by providing a brief introduction to COMPARA, the particular parallel corpus in connection with which the present system was developed, although with a view to achieving a general architecture. In the following sections we describe the incremental building process in DISPARA, emphasizing the reuse of software components, and discuss the Web interface. We conclude by discussing remaining work and emphasizing the importance of user feedback.