No tag, a little nesting, and great XML keyword search

  • Authors:
  • Lingbo Kong; Shiwei Tang; Dongqing Yang; Tengjiao Wang; Jun Gao

  • Affiliations:
  • Department of Computer Science and Technology, Peking University, Beijing, China (all authors)

  • Venue:
  • AIRS'06 Proceedings of the Third Asia conference on Information Retrieval Technology
  • Year:
  • 2006

Abstract

Keyword search, borrowed from Information Retrieval (IR), is one of the most convenient ways for ordinary users to find information of interest. As XML data becomes more widespread, adapting keyword search to XML data is attracting growing attention. In this paper, we first explore a nesting mechanism for XML keyword search that requires only a small amount of nesting in the query. This approach has several benefits. First, it is convenient for ordinary users, because they need not know how the target XML data is organized. Second, the nesting pattern can easily be transformed into structural hints, since it follows the same mechanism as the XML data model itself. Finally, since no label information is needed, XML fragments can be retrieved from different schemas. In addition, this paper proposes a new similarity measure for retrieved XML fragments, which may come from different schemas. Its kernel is the KCAM (Keyword Common Ancestor Matrix) structure, which stores the level information of the SLCA (Smallest Lowest Common Ancestor) node between two keywords. By mapping XML fragments into KCAMs, structural similarity can be computed using matrix distance. The KCAM distance works well with the nesting keyword method.
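
The KCAM idea can be illustrated with a minimal sketch. The abstract does not give the exact construction, so everything below is an assumption made for illustration: each keyword occurs once per fragment, node positions are encoded as Dewey-style paths (root at level 1), the entry for a keyword pair is the depth of their lowest common ancestor (which coincides with the SLCA when each keyword occurs only once), and the matrix distance is the Frobenius norm. The names `lca_level`, `kcam`, and `kcam_distance` are hypothetical, not taken from the paper.

```python
from math import sqrt

def lca_level(path_a, path_b):
    """Depth of the lowest common ancestor of two nodes.

    Assumption: nodes are given as Dewey-style paths, e.g. (1, 1, 2),
    so the LCA depth is the length of the common prefix (root = 1).
    """
    level = 0
    for x, y in zip(path_a, path_b):
        if x != y:
            break
        level += 1
    return level

def kcam(positions, keywords):
    """Build a keyword-by-keyword matrix of common-ancestor levels.

    positions maps each keyword to the Dewey path of the single node
    that contains it (single occurrence is an assumption of this sketch).
    """
    return [[lca_level(positions[k1], positions[k2]) for k2 in keywords]
            for k1 in keywords]

def kcam_distance(m1, m2):
    """Frobenius distance between two matrices of equal shape
    (the choice of norm is an assumption, not stated in the abstract)."""
    return sqrt(sum((a - b) ** 2
                    for row1, row2 in zip(m1, m2)
                    for a, b in zip(row1, row2)))

keywords = ["xml", "keyword", "search"]

# Fragment A: "xml" and "keyword" share a parent below the root.
frag_a = {"xml": (1, 1, 1), "keyword": (1, 1, 2), "search": (1, 2)}
# Fragment B: all three keywords sit directly under the root.
frag_b = {"xml": (1, 1), "keyword": (1, 2), "search": (1, 3)}

matrix_a = kcam(frag_a, keywords)
matrix_b = kcam(frag_b, keywords)
print(matrix_a)                           # [[3, 2, 1], [2, 3, 1], [1, 1, 2]]
print(kcam_distance(matrix_a, matrix_b))  # 2.0
```

Because the matrices are indexed by keywords, not by element labels, two fragments from different schemas can be compared as long as they contain the same query keywords, which matches the label-free setting the abstract describes.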