As data volumes grow at an extraordinary rate, cloud computing technologies have become critically important to the progress of research. Apache Hadoop has become a widely used open-source cloud computing framework that provides a distributed file system for large-scale data processing. In this paper, we present a cloud computing implementation of NCIM (Node Clustering Indexing Method), an XML indexing method developed by our research team, which uses MapReduce to index and query a large number of large XML documents. The experimental results show that NCIM is well suited to a cloud computing environment. A throughput of 1200 queries per second under a heavy query load on a 15-node cluster indicates NCIM's potential for fast query processing over very large collections of Internet documents.