Designing clusters of PCs for distributed databases that process OLAP (On-Line Analytical Processing) workloads in parallel with good scalability remains a particular challenge, because we lack a deep understanding of the architectural issues around resource usage by standard DBMSs on distributed platforms.

To address this problem, we present a novel performance monitoring framework that filters and abstracts samples of performance data from low-level counters into a high-level performance picture. The framework runs side by side with the DBMS and delivers many interesting insights into the most critical resource for different queries and system configurations. As required for a larger distributed hardware/software system, our solution comprises software instrumentation at the OS level, tools for gathering performance-relevant data, and an analytical model for performance evaluation and for performance prediction on future platforms.

We demonstrate the viability of our approach with an in-depth analysis of distributed TPC-D, a standard OLAP benchmark, running on clusters of commodity PCs. Based on the data provided by our framework, we isolate and resolve a few crucial performance issues of OLAP workloads on clusters. For different queries, we give a workload characterization in terms of resource usage, quantify the optimal scalability, and investigate the impact of networking speed on overall application performance. We show that disk performance and CPU speed remain the most critical resource bottlenecks for most queries. Queries with a lot of inter-node communication are limited by the inefficiency of the communication software within the DBMS, not by raw networking speed. A systematic performance evaluation constitutes a solid basis for architectural decisions and system optimization in clusters of PCs that are dedicated to large parallel database systems.