Commercial workloads place heavy demands on a server's memory and storage subsystems and often generate large amounts of traffic on the I/O and memory buses. To reduce data movement between the storage subsystem and the processing units, we propose a hierarchical computing (HC) system that distributes processing elements across the storage hierarchy. We present a programming model that decomposes database queries into simple operations. These operations are then distributed to and executed by the different layers of the hierarchy, depending on each task's affinity to a particular layer. Commands percolate down to the lower layers of the hierarchy, and partially processed information flows up to the higher layers, where subsequent operations can be performed. We evaluate the proposed hierarchical computing model through full-system simulations of a business decision support system (DSS) workload. On a group of TPC-H-like queries, hierarchical computing systems reduce the amount of data transferred over the processor-to-memory interconnect by 37 to 58 percent. We also observe that HC configurations achieve speedups between 1.14x and 1.45x over a CC-NUMA system with 32 processors.
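The idea of decomposing a query into simple operations and running each at the layer it has affinity to can be illustrated with a small Python sketch. This is not the paper's programming model. The layer names, the `Op` and `run_plan` helpers, the sample rows, and the hand-assigned affinities are all assumptions made for illustration. The sketch only counts how many rows each layer hands upward, to show why pushing selection and aggregation down the hierarchy can reduce data movement.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Row = Tuple[str, int]  # (region, amount): an illustrative schema, not from the paper

# Hypothetical layers, ordered from closest-to-data up to the host processor.
LAYERS = ["disk", "memory", "cpu"]


@dataclass
class Op:
    name: str
    layer: str  # layer assumed to have the best affinity (assigned by hand here)
    fn: Callable[[List[Row]], List[Row]]


def select_large(rows: List[Row]) -> List[Row]:
    """Selection: keep rows whose amount is at least 100."""
    return [r for r in rows if r[1] >= 100]


def sum_by_region(rows: List[Row]) -> List[Row]:
    """Aggregation: total amount per region."""
    totals: Dict[str, int] = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0) + amount
    return sorted(totals.items())


def order_by_total(rows: List[Row]) -> List[Row]:
    """Ordering: largest total first."""
    return sorted(rows, key=lambda r: r[1], reverse=True)


def run_plan(rows: List[Row], ops: List[Op]) -> Tuple[List[Row], Dict[str, int]]:
    """Run ops layer by layer, bottom-up.

    Returns the final result and, per layer, the number of rows that
    layer hands to the layer above it (a crude proxy for data movement).
    """
    data = list(rows)
    rows_out: Dict[str, int] = {}
    for layer in LAYERS:
        for op in ops:
            if op.layer == layer:
                data = op.fn(data)
        rows_out[layer] = len(data)
    return data, rows_out


ROWS: List[Row] = [("EU", 50), ("EU", 150), ("US", 200), ("US", 20), ("EU", 300)]

# Hierarchical plan: each operation runs at its assumed best layer.
hierarchical = [
    Op("select", "disk", select_large),
    Op("aggregate", "memory", sum_by_region),
    Op("order", "cpu", order_by_total),
]

# Baseline plan: the same operations, all executed by the host processor.
baseline = [Op(op.name, "cpu", op.fn) for op in hierarchical]

hc_result, hc_rows = run_plan(ROWS, hierarchical)
base_result, base_rows = run_plan(ROWS, baseline)
```

Both plans compute the same answer, but in this toy input the hierarchical plan passes 3 rows up from the disk layer and 2 from the memory layer, while the baseline passes all 5 rows through both. A real system would also have to account for the limited compute capability of the lower layers, which this sketch ignores.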