In recent years, a growing number of XML repositories have emerged, e.g., XML digital libraries and the SIGMOD and DBLP document collections. Because XML can represent both structured and unstructured data, making this information usable requires support for both structure-based and content-based (full-text) queries over XML repositories. With the existing XPath/XQuery Full-Text, users can search using cardinality, proximity, or distance predicates. In this paper, we propose an efficient approach to Information Retrieval (IR) style search on XML documents, in particular search with distance predicates. A numbering technique is employed to encode XML documents, and three algorithms are then designed to evaluate queries with distance predicates. Several optimization techniques are introduced to improve performance. Extensive experiments show the effectiveness and efficiency of the proposed approach.
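To make the idea of a distance predicate concrete, the following is a minimal sketch and not the paper's method: the abstract does not describe its numbering scheme or its three algorithms. The sketch assumes a simple numbering in which every word receives a global position in document order, and it treats "distance" as the difference between two positions. The function names `index_words` and `within_distance` are hypothetical and chosen only for illustration.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict


def index_words(xml_text):
    """Number every word in document order and return word -> sorted positions.

    Assumption: a flat global word counter stands in for the paper's
    (unspecified) numbering technique.
    """
    root = ET.fromstring(xml_text)
    postings = defaultdict(list)
    position = 0

    def add_text(text):
        nonlocal position
        for word in re.findall(r"\w+", text or ""):
            postings[word.lower()].append(position)
            position += 1

    def visit(elem):
        # Document order: the element's own text, then each child
        # followed by the text that trails that child.
        add_text(elem.text)
        for child in elem:
            visit(child)
            add_text(child.tail)

    visit(root)
    return postings


def within_distance(postings, word_a, word_b, max_distance):
    """Return True if some pair of occurrences is at most max_distance positions apart."""
    a = postings.get(word_a.lower(), [])
    b = postings.get(word_b.lower(), [])
    i = j = 0
    # Merge-style scan over the two sorted position lists.
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= max_distance:
            return True
        if a[i] < b[j]:
            i += 1
        else:
            j += 1
    return False


doc = (
    "<article><title>XML keyword search</title>"
    "<abstract>Ranked retrieval over XML data</abstract></article>"
)
postings = index_words(doc)
print(within_distance(postings, "search", "ranked", 1))  # True
print(within_distance(postings, "keyword", "data", 3))   # False
```

Because positions are appended in document order, each posting list is already sorted, so the check needs only one linear pass over the two lists. This sketch ignores element boundaries; a structure-aware query would also need the numbering to record which element each word belongs to.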