Large volumes of XML data arrive continually from new Web applications, and SLCA (Smallest Lowest Common Ancestor)-based XML keyword search is one of the most important information retrieval approaches for such data. Previous approaches focus on building indexes over XML documents. However, in information dissemination scenarios it is impossible to build an index in advance for continuous XML document streams. This paper addresses SLCA-based keyword search over continuous XML documents using the MapReduce mechanism. We use parallel algorithms to process large numbers of XML documents in a Hadoop environment. A distributed SLCA computation method is designed in which each network node computes SLCAs independently and only a small amount of information needs to be transmitted. A real Hadoop environment is built, and we demonstrate the efficiency of our algorithms both analytically and experimentally.
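To make the SLCA notion concrete, the following is a minimal single-machine sketch in Python. It is not the distributed algorithm of the paper, and it does not use Hadoop. It assumes a naive post-order scan in which a node is an SLCA if its subtree contains every query keyword and none of its children's subtrees does. The function names `slca` and `map_document`, the Dewey-style labels, and the whitespace tokenization are illustrative assumptions.

```python
import xml.etree.ElementTree as ET


def slca(xml_text, keywords):
    """Return Dewey-style labels (tuples) of the SLCA nodes of one document.

    Naive illustration: a node qualifies if its subtree contains all
    keywords and no child subtree already contains all of them.
    """
    wanted = {k.lower() for k in keywords}
    results = []
    if not wanted:
        return results

    def visit(node, dewey):
        # Keywords matched by this element's own tag name or leading text.
        tokens = {node.tag.lower()}
        tokens.update((node.text or "").lower().split())
        found = tokens & wanted
        child_is_complete = False
        for i, child in enumerate(node):
            child_found = visit(child, dewey + (i,))
            if child_found == wanted:
                child_is_complete = True
            found |= child_found
            # Text following a child (its "tail") belongs to this element.
            found |= set((child.tail or "").lower().split()) & wanted
        if found == wanted and not child_is_complete:
            results.append(dewey)
        return found

    visit(ET.fromstring(xml_text), (0,))
    return results


def map_document(doc_id, xml_text, keywords):
    """Hypothetical map step: one document in, small (doc_id, label) pairs out."""
    return [(doc_id, ".".join(map(str, d))) for d in slca(xml_text, keywords)]


doc = (
    "<bib>"
    "<book><title>XML search</title><author>Smith</author></book>"
    "<book><title>XML basics</title><author>Jones</author></book>"
    "</bib>"
)
print(map_document("d1", doc, ["xml", "smith"]))
```

In this sketch each document is processed without reference to any other, and only short node labels are emitted, which mirrors in spirit the abstract's point that nodes compute independently and transmit little information.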