A Fast Retrieval Algorithm for Large-Scale XML Data

Authors:
Hiroki Tanioka
Affiliations:
Innovative Technology R&D, JustSystems Corporation, Japan
Venue:
Focused Access to XML Documents
Year:
2008

Abstract

This paper proposes a novel approach to retrieving large-scale XML data using the vector space model, which is commonly used in the information retrieval community. Last year, for the Initiative for the Evaluation of XML Retrieval (INEX) 2006 Adhoc Track, we developed a system based on fragment elements. That system could search XML elements under queries with varying constraints on which elements to include, without reindexing the collection, and so supported more flexible queries. However, it took significant time to unitize the fragment elements. To solve this problem, our new system combines an inverted-file list with a relative inverted-path list, built on the INEX 2007 Adhoc Track corpus.
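The general idea can be illustrated with a minimal sketch. The abstract does not describe the data structures in detail, so everything below is an assumption made for illustration and not the author's implementation: the toy elements, the TF-IDF weighting, the plain (absolute) path list standing in for the relative inverted-path list, and the names `build_index` and `search` are all hypothetical. The sketch shows an inverted file scored with the vector space model (cosine similarity), with a path constraint applied at query time so that no reindexing is needed.

```python
import math
from collections import Counter, defaultdict

# Toy XML elements as (path, text) pairs. These are invented for illustration.
ELEMENTS = [
    ("/article[1]/title[1]", "fast xml retrieval"),
    ("/article[1]/body[1]/p[1]", "vector space model for xml retrieval"),
    ("/article[1]/body[1]/p[2]", "inverted file index structures"),
]


def build_index(elements):
    """Build an inverted file (term -> {element id: tf}) and a path list."""
    inverted = defaultdict(dict)
    paths = []
    for eid, (path, text) in enumerate(elements):
        paths.append(path)
        for term, tf in Counter(text.split()).items():
            inverted[term][eid] = tf
    return inverted, paths


def search(query, inverted, paths, path_prefix=""):
    """Rank elements by cosine similarity over TF-IDF weights.

    Only elements whose path starts with path_prefix are scored, so the
    structural constraint is applied at query time (assumed behavior).
    """
    n = len(paths)
    # Smoothed IDF (an assumption) so that very common terms keep a small weight.
    idf = {t: math.log(1 + n / len(post)) for t, post in inverted.items()}

    # Squared vector length of every element, accumulated from the postings.
    norms = [0.0] * n
    for term, postings in inverted.items():
        for eid, tf in postings.items():
            norms[eid] += (tf * idf[term]) ** 2

    # Query vector restricted to terms that occur in the index.
    q_weights = {
        t: tf * idf[t] for t, tf in Counter(query.split()).items() if t in idf
    }
    q_norm = math.sqrt(sum(w * w for w in q_weights.values()))

    # Accumulate dot products by walking only the postings of the query terms.
    scores = defaultdict(float)
    for term, qw in q_weights.items():
        for eid, tf in inverted[term].items():
            if paths[eid].startswith(path_prefix):
                scores[eid] += qw * tf * idf[term]

    ranked = [
        (paths[eid], dot / (math.sqrt(norms[eid]) * q_norm))
        for eid, dot in scores.items()
    ]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


inverted, paths = build_index(ELEMENTS)
for path, score in search("xml retrieval", inverted, paths):
    print(f"{score:.3f}  {path}")
```

Passing a different `path_prefix`, for example `search("xml retrieval", inverted, paths, path_prefix="/article[1]/body[1]")`, restricts the results to elements under that path while reusing the same index.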