An increasing share of the world's data is either shared through the Web or produced directly through and for Web platforms, in particular using structured formats such as XML or JSON. Cloud platforms are attractive candidates for handling large data repositories because of their elastic scaling properties. Popular commercial clouds provide a variety of sub-systems and primitives for storing data in specific formats (files, key-value pairs, etc.), as well as dedicated sub-systems for running and coordinating execution within the cloud. We propose an architecture for warehousing large-scale Web data, in particular XML, in a commercial cloud platform, specifically Amazon Web Services. Since cloud users bear monetary costs directly tied to their consumption of cloud resources, we focus on indexing content in the cloud. We study the applicability of several indexing strategies and show that they not only reduce query evaluation time but also, importantly, reduce the monetary costs of operating the cloud-based warehouse. Our architecture can be easily adapted to similar cloud-based warehousing settings for complex data, carrying over the benefits of access path selection in the cloud.