Keyword search for smallest lowest common ancestors (SLCAs) in XML data is widely accepted as a meaningful way to identify matching nodes, that is, nodes whose subtrees contain every keyword of an input query. Although SLCA and its variants (e.g., MLCA) perform admirably at identifying matching nodes, they perform surprisingly poorly for searches on irregular schemas with missing elements: (sub)elements that are optional, appearing in some instances of an element type but not in others (e.g., a "population" subelement of a "city" element might be present when the population is known and absent when it is not). In this paper, we generalize the SLCA search paradigm to support queries involving missing elements. Specifically, we propose a novel property called optionality resilience that specifies the desired behavior of an XML keyword search (XKS) approach for queries involving missing elements. We present two variants of a novel algorithm called MESSIAH (Missing Element-conSciouS hIgh-quality SLCA searcH), both of which are optionality resilient on irregular documents. MESSIAH logically transforms an XML document into a minimal full document in which all missing elements are represented as empty elements, i.e., the irregular schema is made "regular", and then employs efficient strategies to identify partial and complete full SLCA nodes (SLCA nodes in the full document) from it. In particular, it generates the same SLCA nodes as any state-of-the-art approach when the query does not involve missing elements but avoids irrelevant results when missing elements are involved. Our experimental study demonstrates the ability of MESSIAH to produce superior quality search results.
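To make the missing-element problem concrete, the following minimal Python sketch computes SLCA nodes naively on a toy document, once as-is and once after padding missing subelements with empty ones. It is an illustration under stated assumptions, not the MESSIAH algorithm: the sample document, the helper names (`node_terms`, `slca`, `pad_missing`), the rule that a keyword matches a tag name or a word of a node's text, and the physical padding (the abstract describes a logical transformation) are all hypothetical choices made for this example.

```python
import copy
import xml.etree.ElementTree as ET

# Toy irregular document (an assumption): the second city has no <population>.
DOC = """
<country>
  <city><name>Paris</name><population>2100000</population></city>
  <city><name>Lyon</name></city>
</country>
"""


def node_terms(node):
    """Terms a node matches directly: its tag name plus the words of its own text."""
    terms = {node.tag.lower()}
    terms.update((node.text or "").lower().split())
    return terms


def slca(root, keywords):
    """Naive SLCA: nodes whose subtree covers every keyword while no child subtree does."""
    wanted = {k.lower() for k in keywords}
    results = []

    def visit(node):
        # Returns (keywords covered by this subtree, whether it covers all of them).
        covered = node_terms(node) & wanted
        child_is_full = False
        for child in node:
            child_covered, child_full = visit(child)
            covered |= child_covered
            child_is_full = child_is_full or child_full
        is_full = covered == wanted
        if is_full and not child_is_full:
            results.append(node)
        return covered, is_full

    visit(root)
    return results


def pad_missing(root):
    """Return a copy in which every element has all child tags seen in any
    element with the same tag; absent ones are added as empty elements.
    Simplification: newly added empty elements are not padded recursively."""
    full = copy.deepcopy(root)
    child_tags = {}  # element tag -> child tags observed across all instances
    for node in full.iter():
        seen = child_tags.setdefault(node.tag, [])
        for child in node:
            if child.tag not in seen:
                seen.append(child.tag)
    for node in list(full.iter()):
        present = {child.tag for child in node}
        for tag in child_tags[node.tag]:
            if tag not in present:
                ET.SubElement(node, tag)
    return full


root = ET.fromstring(DOC)
query = ["Lyon", "population"]

before = slca(root, query)              # SLCA on the original document
after = slca(pad_missing(root), query)  # SLCA on the padded ("full") document

print([n.tag for n in before])
print([(n.tag, n.findtext("name")) for n in after])
```

On the original document the only node covering both keywords is the root, because "population" occurs only under Paris; this is the kind of irrelevant result the abstract refers to. After padding, the Lyon `city` element itself covers both keywords and becomes the SLCA.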