Encapsulation of parallelism in the Volcano query processing system
SIGMOD '90 Proceedings of the 1990 ACM SIGMOD international conference on Management of data
Parallel database systems: the future of high performance database systems
Communications of the ACM
IBM Systems Journal
ACM SIGMOD Record
Eddies: continuously adaptive query processing
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Optimization of parallel query execution plans in XPRS
PDIS '91 Proceedings of the first international conference on Parallel and distributed information systems
Garlic: a new flavor of federated query processing for DB2
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Prototyping Bubba, A Highly Parallel Database System
IEEE Transactions on Knowledge and Data Engineering
The Gamma Database Machine Project
IEEE Transactions on Knowledge and Data Engineering
Dynamic and Load-balanced Task-Oriented Database Query Processing in Parallel Systems
EDBT '92 Proceedings of the 3rd International Conference on Extending Database Technology: Advances in Database Technology
An Effective Algorithm for Parallelizing Hash Joins in the Presence of Data Skew
Proceedings of the Seventh International Conference on Data Engineering
Don't Scrap It, Wrap It! A Wrapper Architecture for Legacy Data Sources
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
NCR 3700 - The Next-Generation Industrial Database Computer
VLDB '93 Proceedings of the 19th International Conference on Very Large Data Bases
Dynamic Multi-Resource Load Balancing in Parallel Database Systems
VLDB '95 Proceedings of the 21st International Conference on Very Large Data Bases
Managing Intra-operator Parallelism in Parallel Database Systems
VLDB '95 Proceedings of the 21st International Conference on Very Large Data Bases
Highly available, fault-tolerant, parallel dataflows
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Resource Scheduling for Parallel Query Processing on Computational Grids
GRID '04 Proceedings of the 5th IEEE/ACM International Workshop on Grid Computing
Tuple routing strategies for distributed eddies
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Adaptive workload allocation in query processing in autonomous heterogeneous environments
Distributed and Parallel Databases
Autonomic query parallelization using non-dedicated computers: an evaluation of adaptivity options
The VLDB Journal — The International Journal on Very Large Data Bases
Automation everywhere: autonomics and data management
BNCOD'07 Proceedings of the 24th British national conference on Databases
Load-balancing for WAN warehouses
DASFAA'08 Proceedings of the 13th international conference on Database systems for advanced applications
Run-time adaptivity for search computing
Search computing
An efficient skew-insensitive algorithm for join processing on grid architectures
Proceedings of the fifth international workshop on High-level parallel programming and applications
Efficient load balancing in partitioned queries under random perturbations
ACM Transactions on Autonomous and Adaptive Systems (TAAS) - Special section on formal methods in pervasive computing, pervasive adaptation, and self-adaptive systems: Models and algorithms
Utility-driven adaptive query workload execution
Future Generation Computer Systems
Just-in-time data distribution for analytical query processing
ADBIS'12 Proceedings of the 16th East European conference on Advances in Databases and Information Systems
We present DITN, a new method of parallel querying based on dynamic outsourcing of join processing tasks to non-dedicated, heterogeneous computers. In DITN, partitioning is not the means of parallelism. Data layout decisions are taken outside the scope of the DBMS and handled within the storage software; query processors see a "Data In The Network" image. This allows gradual scaleout as the workload grows, by using non-dedicated computers.
A typical operator in a parallel query plan is Exchange [7]. We argue that Exchange is unsuitable for non-dedicated machines because it addresses node heterogeneity poorly and is vulnerable to failures or load spikes during query execution. DITN uses an alternative intra-fragment parallelism in which each node executes an independent select-project-join-aggregate-group-by block, with no tuple exchange between nodes. This method cleanly handles heterogeneous nodes and adapts well during execution to node failures or load spikes.
Initial experiments suggest that DITN performs competitively with a traditional configuration of dedicated machines and well-partitioned data for at least up to 10 processors. At the same time, DITN gives significant flexibility in terms of gradual scaleout and handling of heterogeneity, load bursts, and failures.
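The idea of nodes running independent join-and-aggregate blocks with no tuple exchange between them can be illustrated with a small sketch. This is not the DITN implementation: the names `run_block`, `worker`, and `parallel_query`, the pull-based task queue, the chunk size, and the use of threads in place of remote machines are all assumptions made for illustration. Each worker pulls a chunk of the outer table, joins it against the inner table, and pre-aggregates locally; only the small partial aggregates are merged at the end, so a faster worker simply ends up processing more chunks.

```python
import queue
import threading
from collections import defaultdict


def run_block(outer_chunk, inner_index):
    """Hypothetical per-node block: hash-join probe plus local SUM per group."""
    partial = defaultdict(int)
    for key, amount in outer_chunk:
        group = inner_index.get(key)  # probe the inner table by join key
        if group is not None:         # rows without a match are dropped
            partial[group] += amount
    return partial


def worker(tasks, results, inner_index):
    """Pull work units until none remain; workers never exchange tuples."""
    while True:
        try:
            chunk = tasks.get_nowait()
        except queue.Empty:
            return
        results.put(run_block(chunk, inner_index))


def parallel_query(outer_rows, inner_rows, n_workers=3, chunk_size=2):
    """Assumed coordinator: split the outer table, then merge partial sums."""
    # Simplification: every worker shares one in-memory copy of the inner table.
    inner_index = dict(inner_rows)
    tasks = queue.Queue()
    for start in range(0, len(outer_rows), chunk_size):
        tasks.put(outer_rows[start:start + chunk_size])

    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results, inner_index))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Merge the per-chunk partial aggregates into the final answer.
    total = defaultdict(int)
    while not results.empty():
        for group, value in results.get().items():
            total[group] += value
    return dict(total)


if __name__ == "__main__":
    outer = [(1, 10), (2, 20), (1, 5), (3, 7), (4, 1)]  # (join key, amount)
    inner = [(1, "a"), (2, "b"), (3, "a")]              # (join key, group)
    print(parallel_query(outer, inner))
```

Because work is pulled rather than assigned up front, a slow or busy worker holds back only the chunk it is currently processing. The sketch does not model failure recovery; re-queuing the chunk of a failed worker would be one possible extension.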