The success of Internet applications has led to explosive growth in the demand for bandwidth from Internet Service Providers. Managing an Internet Protocol (IP) network requires collecting and analyzing network data, such as flow-level traffic statistics. Such analyses can typically be expressed as OLAP queries, e.g., correlated aggregate queries and data cubes. Current OLAP tools for this task assume that the data is available in a centralized data warehouse. However, the inherently distributed nature of data collection and the huge amount of data extracted at each collection point make it impractical to gather all data at a centralized site. One solution is to maintain a distributed data warehouse, consisting of local data warehouses at each collection point and a coordinator site, with most of the processing performed at the local sites. In this paper, we consider the problem of efficiently evaluating OLAP queries over a distributed data warehouse. We have developed the Skalla system for this task. Skalla translates OLAP queries, specified as certain algebraic expressions, into distributed evaluation plans, which are shipped to individual sites. A salient property of our approach is that only partial results are shipped, never parts of the detail data. We propose a variety of optimizations to minimize both the synchronization traffic and the local processing done at each site. Finally, we present an experimental study based on TPC-R data. Our results demonstrate the scalability of our techniques and quantify the performance benefits of the optimizations built into the Skalla system.
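The idea that only partial results travel between sites can be illustrated with a minimal Python sketch. It is not Skalla's evaluation algorithm or plan format, which the abstract does not specify; it only shows the general technique of computing per-group partial aggregates locally and merging them at a coordinator. The function names (`local_partial`, `coordinator_merge`), the (sum, count) representation, and the sample flow records are all hypothetical.

```python
from collections import defaultdict


def local_partial(rows):
    """Run at a local site: reduce detail rows to per-group (sum, count) pairs.

    `rows` is an iterable of (group_key, value) tuples. Only the returned
    dictionary would be shipped; the detail rows stay at the site.
    """
    partial = defaultdict(lambda: [0, 0])
    for group_key, value in rows:
        acc = partial[group_key]
        acc[0] += value  # running sum for this group
        acc[1] += 1      # running row count for this group
    return {key: (s, c) for key, (s, c) in partial.items()}


def coordinator_merge(partials):
    """Run at the coordinator: combine partial results from all sites.

    SUM and COUNT merge by addition; AVG is derived from the merged
    sum and count rather than by averaging the per-site averages.
    """
    merged = defaultdict(lambda: [0, 0])
    for partial in partials:
        for key, (s, c) in partial.items():
            merged[key][0] += s
            merged[key][1] += c
    return {
        key: {"sum": s, "count": c, "avg": s / c}
        for key, (s, c) in merged.items()
    }


# Hypothetical flow records: (destination prefix, bytes transferred).
site_a = [("10.0.0.0/8", 120), ("10.0.0.0/8", 80), ("192.168.0.0/16", 50)]
site_b = [("10.0.0.0/8", 100), ("192.168.0.0/16", 150)]

result = coordinator_merge([local_partial(site_a), local_partial(site_b)])
```

In this sketch each site ships one (sum, count) pair per group, so the shipped volume depends on the number of groups rather than on the number of detail rows held at the site.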