Platform-as-a-service (PaaS) systems, such as Google App Engine (GAE), simplify web application development and cloud deployment by providing developers with complete software stacks: runtime systems and scalable services accessible from well-defined APIs. Extant PaaS offerings are designed and specialized to support large numbers of concurrently executing web applications (multi-tier programs that encapsulate and integrate business logic, user interface, and data persistence). To enable this, PaaS systems impose a programming model that places limits on available library support, execution duration, data access, and data persistence. Although successful and scalable for web services, such support is less amenable to online analytical processing (OLAP) workloads, which have variable resource requirements and require greater flexibility for ad-hoc queries and data analysis. OLAP of web applications is key to understanding how programs are used in live settings. In this work, we empirically evaluate OLAP support in the GAE public cloud and discuss its benefits and limitations. We then present an alternate approach, which combines the scale of GAE with the flexibility of customizable offline data analytics. To enable this, we build upon and extend the AppScale PaaS, an open source private cloud platform that is API-compatible with GAE. Our approach couples GAE and AppScale to provide a hybrid cloud that transparently shares data between public and private platforms, and decouples public application execution from private analytics over the same datasets. Our extensions to AppScale eliminate the restrictions GAE imposes and integrate popular data analytic programming models to provide a framework for complex analytics, testing, and debugging of live GAE applications with low overhead and cost.