An architecture for XML information retrieval in a peer-to-peer environment

  • Authors:
  • Judith Winter; Oswald Drobnik

  • Affiliations:
  • J. W. Goethe-University, Frankfurt/Main, Germany; J. W. Goethe-University, Frankfurt/Main, Germany

  • Venue:
  • Proceedings of the ACM first Ph.D. workshop in CIKM
  • Year:
  • 2007

Abstract

XML has become a widely accepted standard for modelling, storing, and exchanging structured documents. Exploiting document structure can notably improve retrieval performance for XML documents. A growing number of these documents are stored in Peer-to-Peer networks, which are promising self-organizing infrastructures. Documents are distributed over the Peer-to-Peer network either by being stored locally on individual peers or by being assigned to collections such as Digital Libraries. Current search methods for XML documents in Peer-to-Peer networks do not use Information Retrieval techniques for vague queries and relevance detection. Our work aims to develop a search engine for XML documents in which Information Retrieval methods are enhanced with structural information. Documents and the global index are distributed over a Peer-to-Peer network, forming a virtually unlimited storage space. This paper proposes a conceptual architecture for XML Information Retrieval in Peer-to-Peer networks. Based on this general architecture, it then presents a component-structured architecture for a concrete search engine, which uses an extension of the Vector Space Model to compute relevance for dynamic XML documents.
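
To illustrate what a structure-aware Vector Space Model might look like, the following is a minimal sketch only, not the extension the paper itself defines. It assumes a document is a list of (element, term) pairs and that a term counts more when it appears in certain elements. The element names, the weights in `ELEMENT_WEIGHTS`, the smoothed idf formula, and all function names are hypothetical and chosen for illustration.

```python
import math
from collections import Counter

# Hypothetical structural weights: a term counts more in some elements than others.
ELEMENT_WEIGHTS = {"title": 2.0, "abstract": 1.5, "body": 1.0}


def weighted_tf(doc):
    """Structure-weighted term frequencies for a list of (element, term) pairs."""
    tf = Counter()
    for element, term in doc:
        tf[term] += ELEMENT_WEIGHTS.get(element, 1.0)
    return tf


def idf(term, docs):
    """Smoothed inverse document frequency (an assumed variant, always positive)."""
    df = sum(1 for d in docs if any(t == term for _, t in d))
    return math.log((1 + len(docs)) / (1 + df)) + 1.0


def vectorize(doc, docs):
    """Map a document to a sparse term vector of weighted tf times idf."""
    return {t: w * idf(t, docs) for t, w in weighted_tf(doc).items()}


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(query_terms, docs):
    """Return (score, document index) pairs, best match first."""
    query = vectorize([("body", t) for t in query_terms], docs)
    scores = [(cosine(query, vectorize(d, docs)), i) for i, d in enumerate(docs)]
    return sorted(scores, reverse=True)


docs = [
    [("title", "xml"), ("body", "retrieval")],
    [("body", "xml"), ("body", "peer"), ("body", "peer")],
]
ranking = rank(["xml"], docs)
```

In this toy collection the first document ranks higher for the query "xml" because the term occurs in its title, which the assumed weights favour. In a Peer-to-Peer setting, the collection statistics used by `idf` would have to come from the distributed global index rather than from a local list.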