Enrichment of text documents using information retrieval techniques in a distributed environment

  • Authors:
  • Francisco Bueno; Ana García-Serrano; José L. Martínez-Fernández

  • Affiliations:
  • Facultad de Informática, Universidad Politécnica de Madrid, Spain; ETSI Informática, Universidad Nacional de Educación a Distancia (UNED), Madrid, Spain; Departamento de Informática, Universidad Carlos III de Madrid, Spain

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2010

Quantified Score

Hi-index 12.05

Abstract

The main goal of this paper is to describe a distributed information retrieval model deployed to provide the functionalities needed for the enrichment of a document. Enriching a document here means finding, in a distributed environment, most of the documents related to it. Moreover, the documents are news items, which may arrive at the system at any time, and response time is critical. We first define the architecture to be deployed, designed to test the effect of different combination approaches for selecting and ranking a set of documents in a continuously changing environment. We then discuss the different techniques that can be used in this approach. Finally, we describe a prototype version of the developed software, previously established within the EU project NEDINE (e-Content 2225). The prototype is written in Ciao, taking advantage of its features for the development of distributed systems, with Java also used for interfacing the system.