A look at some issues during textual linking of homogeneous web repositories

  • Authors:
  • José Antonio Camacho-Guerrero; Alessandra Alaniz Macedo; Maria da Graça Campos Pimentel

  • Affiliations:
  • Universidade de São Paulo, São Carlos, Brazil; Universidade de São Paulo, Ribeirão Preto, Brazil; Universidade de São Paulo, São Carlos, Brazil

  • Venue:
  • Proceedings of the 2004 ACM Symposium on Document Engineering
  • Year:
  • 2004

Abstract

By interacting over the Web with services that create links automatically, users are able to identify relationships among documents stored in different repositories. Because automatic linking services do not rely on queries formulated by a human user, the way information retrieval techniques can be used to identify relationships is affected. Such techniques can identify relationships that should not have been generated (non-relevant links, that is, poor precision) while failing to identify all relevant relationships (poor recall). To improve the quality of the relationships identified, we have investigated some design issues that arise during the automatic linking of textual repositories. The investigations used a collection of documents from online Brazilian newspapers and the Cystic Fibrosis Collection. The results of the investigations have defined procedures, infrastructures and, consequently, the requirements for a configurable linking service, which is also made available as a contribution of this work.