A Fast Retrieval Algorithm for Large-Scale XML Data

  • Authors:
  • Hiroki Tanioka

  • Affiliations:
  • Innovative Technology R&D, JustSystems Corporation, Japan

  • Venue:
  • Focused Access to XML Documents
  • Year:
  • 2008

Abstract

This paper proposes a novel approach for retrieving large-scale XML data using the vector space model, which is commonly used in the information retrieval community. Last year, for the Initiative for the Evaluation of XML Retrieval (INEX) 2006 Adhoc Track, we developed a system using fragment elements. That system could search over XML elements for queries with varying constraints on which XML elements to include in the search, without reindexing the collection, and so supported more flexible queries. However, the system took significant time to unitize the fragment elements. To solve this problem, our new system is composed of an inverted-file list and a relative inverted-path list, constructed over the INEX 2007 Adhoc Track corpus.
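To make the general idea concrete, the following is a minimal sketch of vector space retrieval over XML elements using an inverted file plus a separate element-to-path table. It is not the paper's implementation: the abstract does not describe the layout of the relative inverted-path list, so the `paths` table, the suffix-based path filter, the TF-IDF weighting, and all names (`ELEMENTS`, `build_index`, `search`) are assumptions chosen for illustration. The sketch only shows how a path constraint can be applied at query time without reindexing.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy collection: (element id, XML path, element text).
ELEMENTS = [
    (0, "/article/title", "fast xml retrieval"),
    (1, "/article/sec/p", "vector space model for xml retrieval"),
    (2, "/article/sec/p", "inverted file index"),
]


def build_index(elements):
    """Build an inverted file (term -> postings) and an element -> path table."""
    postings = defaultdict(list)  # term -> [(element_id, term_frequency)]
    paths = {}                    # element_id -> XML path (assumed layout)
    for eid, path, text in elements:
        paths[eid] = path
        for term, tf in Counter(text.split()).items():
            postings[term].append((eid, tf))

    # Inverse document frequency, smoothed so it is always positive.
    n = len(paths)
    idf = {t: math.log(1 + n / len(p)) for t, p in postings.items()}

    # Euclidean norm of each element's TF-IDF vector, for cosine scoring.
    sq = defaultdict(float)
    for term, plist in postings.items():
        for eid, tf in plist:
            sq[eid] += (tf * idf[term]) ** 2
    norms = {eid: math.sqrt(v) for eid, v in sq.items()}

    return {"postings": postings, "paths": paths, "idf": idf, "norms": norms}


def search(query, index, path_suffix=""):
    """Rank elements by a cosine-style TF-IDF score.

    Only elements whose path ends with `path_suffix` are scored, so the
    structural constraint is checked at query time against the path table.
    """
    scores = defaultdict(float)
    for term in set(query.split()):
        weight = index["idf"].get(term)
        if weight is None:
            continue  # term does not occur in the collection
        for eid, tf in index["postings"][term]:
            if index["paths"][eid].endswith(path_suffix):
                scores[eid] += (tf * weight) * weight
    ranked = [(eid, s / index["norms"][eid]) for eid, s in scores.items()]
    # Highest score first; ties broken by element id for determinism.
    return sorted(ranked, key=lambda pair: (-pair[1], pair[0]))


index = build_index(ELEMENTS)
print(search("xml retrieval", index))                    # all elements
print(search("xml retrieval", index, path_suffix="/p"))  # paragraphs only
```

The same index serves both queries; only the filter changes, which is the property the abstract attributes to searching with varying element constraints without reindexing.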