An original semantics to keyword queries for XML using structural patterns

  • Authors:
  • Dimitri Theodoratos;Xiaoying Wu

  • Affiliations:
  • Department of Computer Science, New Jersey Institute of Technology;Department of Computer Science, New Jersey Institute of Technology

  • Venue:
  • DASFAA'07 Proceedings of the 12th international conference on Database systems for advanced applications
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

XML is by now the de facto standard for exporting and exchanging data on the web. The need for querying XML data sources whose structure is not fully known to the user and the need to integrate multiple data sources with different tree structures have motivated recently the suggestion of keyword-based techniques for querying XML documents. The semantics adopted by these approaches aims at restricting the answers to meaningful ones. However, these approaches suffer from low precision, while recent ones with improved precision suffer from low recall. In this paper, we introduce an original approach for assigning semantics to keyword queries for XML documents. We exploit index graphs (a structural summary of data) to extract tree patterns that return meaningful answers. In contrast to previous approaches that operate locally on the data to compute meaningful answers (usually by computing lowest common ancestors), our approach operates globally on index graphs to detect and exploit meaningful tree patterns. We implemented and experimentally evaluated our approach on DBLP-based data sets with irregularities. Its comparison to previous ones shows that it succeeds in finding all the meaningful answers when the others fail (perfect recall). Further, it outperforms approaches with similar recall in excluding meaningless answers (better precision). Since our approach is based on tree-pattern query evaluation, it can be easily implemented on top of an XQuery engine.