Nearest keyword search in XML documents

  • Authors: Yufei Tao; Stavros Papadopoulos; Cheng Sheng; Kostas Stefanidis
  • Affiliations: Chinese University of Hong Kong, Hong Kong, Hong Kong (all four authors)
  • Venue: Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
  • Year: 2011

Abstract

This paper studies the nearest keyword (NK) problem on XML documents. In general, the dataset is a tree in which each node is associated with one or more keywords. Given a node q and a keyword w, an NK query returns the node nearest to q among all nodes associated with w. NK search is useful not only as a stand-alone operator but also as a building block for important tasks such as XPath query evaluation and keyword search. We present an indexing scheme that answers NK queries efficiently, in terms of both practical and worst-case performance. The query cost is provably logarithmic in the number of nodes carrying the query keyword. The proposed scheme occupies space linear in the dataset size and can be constructed by a fast algorithm. Extensive experimentation confirms our theoretical findings and demonstrates the effectiveness of NK retrieval as a primitive operator in XML databases.
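
To make the query definition concrete, the following is a minimal brute-force sketch of an NK query. It is not the indexing scheme proposed in the paper: it simply runs a breadth-first search outward from q, so its cost grows with the size of the tree rather than logarithmically. The abstract does not define the distance measure, so the sketch assumes that distance is the number of edges on the tree path between two nodes, and that q itself qualifies at distance 0 if it carries w. The function names, node names, and keyword sets are all hypothetical.

```python
from collections import deque


def build_adjacency(parent):
    """Turn a child -> parent map into an undirected adjacency list."""
    adj = {}
    for child, par in parent.items():
        adj.setdefault(child, [])
        if par is not None:
            adj.setdefault(par, [])
            adj[child].append(par)
            adj[par].append(child)
    return adj


def nearest_keyword(adj, keywords, q, w):
    """Brute-force NK query: return (node, distance) or None.

    Assumption: distance is the number of tree edges between two nodes.
    BFS visits nodes in non-decreasing distance from q, so the first node
    found carrying w is a nearest one (ties are broken by visit order).
    """
    seen = {q}
    queue = deque([(q, 0)])
    while queue:
        node, dist = queue.popleft()
        if w in keywords.get(node, ()):
            return node, dist
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # no node in the tree carries w


# Hypothetical toy document: a bibliography with two books.
parent = {
    "bib": None,
    "book1": "bib",
    "book2": "bib",
    "title1": "book1",
    "author1": "book1",
    "title2": "book2",
    "author2": "book2",
}
keywords = {
    "bib": {"bib"},
    "book1": {"book"},
    "book2": {"book"},
    "title1": {"title", "xml"},
    "author1": {"author", "tao"},
    "title2": {"title", "sql"},
    "author2": {"author", "tao"},
}
adj = build_adjacency(parent)

print(nearest_keyword(adj, keywords, "title2", "tao"))  # ('author2', 2)
```

In this toy tree both author nodes carry the keyword "tao", but author2 is two edges from title2 while author1 is four edges away, so the query returns author2. A baseline like this is mainly useful for checking the answers of a faster index on small inputs.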