Improving the performance of identifying contributors for XML keyword search

Authors:
Rung-Ren Lin;Ya-Hui Chang;Kun-Mao Chao
Affiliations:
National Taiwan University, Taipei, Taiwan;National Taiwan Ocean University, Keelung, Taiwan;National Taiwan University, Taipei, Taiwan
Venue:
ACM SIGMOD Record
Year:
2011

References:
Storing and querying ordered XML using a relational database system. Proceedings of the 2002 ACM SIGMOD international conference on Management of data
A Query Language for XML Based on Graph Grammars. World Wide Web
XRANK: ranked keyword search over XML documents. Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Efficient keyword search for smallest LCAs in XML databases. Proceedings of the 2005 ACM SIGMOD international conference on Management of data
XSEarch: a semantic search engine for XML. VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Schema-free XQuery. VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Effective keyword search for valuable LCAs over XML documents. Proceedings of the sixteenth ACM conference on information and knowledge management
Efficient LCA based keyword search in XML data. EDBT '08 Proceedings of the 11th international conference on Extending database technology: Advances in database technology
Reasoning and identifying relevant matches for XML keyword search. Proceedings of the VLDB Endowment
The complexity of query containment in expressive fragments of XPath 2.0. Journal of the ACM (JACM)
Fast ELCA computation for keyword queries on XML data. Proceedings of the 13th International Conference on Extending Database Technology
LCA-based selection for XML document collections. Proceedings of the 19th international conference on World wide web
Evaluation Techniques for Generalized Path Pattern Queries on XML Data. World Wide Web
Faster algorithms for searching relevant matches in XML databases. DEXA'10 Proceedings of the 21st international conference on Database and expert systems applications: Part I

Abstract

Keyword search is a user-friendly mechanism for identifying desired information in XML databases, and the lowest common ancestor (LCA) is a popular concept for locating the meaningful subtrees corresponding to query keywords. Among the LCA-based approaches, MaxMatch [9] is the only one that achieves the properties of monotonicity and consistency, which it does by outputting only contributors instead of the whole subtree. Although the MaxMatch algorithm performs efficiently in some cases, there is still room for improvement. In this paper, we first improve its performance by avoiding unnecessary index accesses. We then speed up subset detection, which is a core procedure for determining contributors. The resulting algorithms are called MinMap and MinMap+, respectively. Finally, we demonstrate the efficiency of our methods both analytically and empirically. According to our experiments, both algorithms outperform MaxMatch, and MinMap+ is particularly helpful when the breadth of the tree is large and the number of keywords grows.
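
To make the subset-detection step concrete, the following is a minimal sketch of a naive pairwise check among sibling nodes. It is not the MinMap or MinMap+ algorithm, whose details are not given in the abstract. It assumes the contributor rule commonly attributed to MaxMatch: a child is kept unless some sibling matches a strict superset of its keywords. The function name `contributors`, the bitmask encoding (bit i set means query keyword i occurs in the child's subtree), and the rule that a child matching no keyword is dropped are all illustrative assumptions.

```python
from typing import Dict, List


def contributors(child_masks: Dict[str, int]) -> List[str]:
    """Return the children whose keyword set is not strictly contained
    in the keyword set of any sibling (naive O(n^2) comparison).

    child_masks maps a hypothetical child label to a bitmask of the
    query keywords matched somewhere in that child's subtree.
    """
    kept = []
    for name, mask in child_masks.items():
        if mask == 0:
            # Assumption: a child matching no keyword is never a contributor.
            continue
        dominated = any(
            # mask is a strict subset of other: contained in it, but not equal.
            other != mask and (mask & other) == mask
            for sibling, other in child_masks.items()
            if sibling != name
        )
        if not dominated:
            kept.append(name)
    return kept


# Three query keywords k0, k1, k2 encoded as bits 0, 1, 2.
siblings = {
    "a": 0b011,  # matches k0 and k1
    "b": 0b001,  # matches k0 only, strictly contained in a's set
    "c": 0b100,  # matches k2 only
    "d": 0b011,  # same set as a, so neither strictly contains the other
    "e": 0b000,  # matches nothing
}
result = contributors(siblings)
```

With this input, `b` is discarded because its keyword set is a strict subset of that of `a`, while `a` and `d` both survive because equal sets do not dominate each other. The pairwise comparison grows quadratically with the number of siblings, which is consistent with the abstract's remark that faster subset detection matters most when the tree is broad and the number of keywords grows.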