The ability to perform effective XML data retrieval in the absence of schema knowledge has recently received considerable attention. The majority of relevant proposals employ heuristics that identify groups of meaningfully related nodes using information extracted from the input data. These heuristics are used to prune the search space of all possible node combinations, and their popularity is evident from the large number of such heuristics and of the systems that use them. However, a comprehensive study of their relative merits has not yet been performed. One challenge in performing such a study is that these techniques were proposed in different contexts that are not directly comparable. In this paper, we attempt to fill this gap. In particular, we first abstract the common selection problem that the relatedness heuristics tackle and show how each heuristic addresses it. We then identify the data categories in which the assumptions made by each heuristic are valid and draw insights on the likely effectiveness of each. Our findings can help system implementors understand the strengths and weaknesses of each heuristic and provide simple guidelines for when each one applies.