Effective XML content and structure retrieval with relevance ranking

  • Authors:
  • Xiping Liu; Changxuan Wan; Lei Chen

  • Affiliations:
  • Jiangxi University of Finance and Economics, Nanchang, China; Jiangxi University of Finance and Economics, Nanchang, China; Hong Kong University of Science and Technology, Hong Kong, Hong Kong

  • Venue:
  • Proceedings of the 18th ACM Conference on Information and Knowledge Management
  • Year:
  • 2009

Abstract

XML documents can be retrieved not only with content-only (CO) queries but also with content-and-structure (CAS) queries. Although CAS queries promise better retrieval precision, they introduce several new challenges. To address these challenges, we propose a novel approach to XML CAS retrieval whose distinctive feature is a content-oriented point of view. Specifically, the approach first decomposes a CAS query into several fragments, then retrieves results for each query fragment in a content-centric way, and finally scores each answer node. The approach adapts to both homogeneous and heterogeneous data environments. To assess the relevance of retrieval results to a query fragment, we present a scoring strategy that measures relevance from both content and structure perspectives. In addition, we propose an effective method for inferring answer nodes from the CAS query and the document structure, and present an efficient algorithm for CAS retrieval. Finally, we demonstrate the effectiveness of the proposed methods through comprehensive experimental studies.
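
The pipeline outlined in the abstract (decompose the query, retrieve per fragment from the content side, then score nodes) can be illustrated with a small sketch. This is not the authors' algorithm: the NEXI-style query syntax, the helper names (`decompose`, `content_score`, `structure_score`, `retrieve`), the keyword-coverage content score, and the 0.5 penalty for a non-matching tag are all assumptions made for illustration only.

```python
import re
import xml.etree.ElementTree as ET

# Assumed NEXI-style syntax: //tag[about(., keywords)] repeated along a path.
FRAGMENT_RE = re.compile(r"//(\w+)\[about\(\.,\s*([^)]*)\)\]")


def decompose(query):
    """Split a CAS query into (tag, keywords) fragments."""
    return [(tag, kws.lower().split()) for tag, kws in FRAGMENT_RE.findall(query)]


def content_score(elem, keywords):
    """Fraction of the fragment's keywords found in the element's text."""
    if not keywords:
        return 0.0
    words = set(re.findall(r"\w+", " ".join(elem.itertext()).lower()))
    return sum(k in words for k in keywords) / len(keywords)


def structure_score(elem, tag):
    """Soft structural match: full credit for the requested tag,
    partial credit otherwise (hypothetical weighting)."""
    return 1.0 if elem.tag == tag else 0.5


def retrieve(root, query):
    """Content-centric retrieval: start from elements whose text matches,
    then weight each match by how well its structure fits the fragment."""
    results = []
    for tag, keywords in decompose(query):
        for elem in root.iter():
            c = content_score(elem, keywords)
            if c > 0:
                results.append((c * structure_score(elem, tag), elem.tag, tag))
    # Highest combined score first; each entry is
    # (score, element tag, tag requested by the fragment).
    return sorted(results, key=lambda r: r[0], reverse=True)


if __name__ == "__main__":
    doc = ET.fromstring(
        "<article><title>XML retrieval</title>"
        "<sec>relevance ranking of nodes</sec><para>ranking</para></article>"
    )
    query = "//article[about(., xml retrieval)]//sec[about(., ranking)]"
    for score, found, wanted in retrieve(doc, query):
        print(f"{score:.2f}  <{found}> for fragment //{wanted}")
```

Treating structure as a soft weight rather than a hard filter is one way to keep such a scheme usable on heterogeneous collections, where a relevant passage may sit under a tag other than the one named in the query. The inference of answer nodes described in the abstract is not modelled here.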