A scale-out system is a collection of interconnected, modular, low-cost computers that work as a single entity to cooperatively provide applications, systems resources, and data to users. The dominant programming model for such systems consists of message passing at the systems level and multithreading at the element level. Scale-out computers have traditionally been developed and deployed to provide levels of performance (throughput and parallel processing) beyond what was achievable by large shared-memory computers built from the fastest processors and the most expensive memory systems. Today, exploiting scale-out at all levels of a system is becoming imperative in order to overcome a fundamental discontinuity in the development of microprocessor technology caused by power dissipation. The pervasive use of greater levels of scale-out, however, creates its own challenges in architecture, programming, systems management, and reliability. This position paper identifies some of the important research problems that must be addressed to deal with the technology disruption and fully realize the opportunity offered by scale-out. Our examples focus on parallelism, but the challenges we identify apply to scale-out more generally.
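The hybrid programming model described above can be illustrated with a minimal sketch: OS processes stand in for scale-out elements that communicate only by message passing, while each element uses local threads for its share of the work. All names here (`element`, `worker`, the queue-based "network") are illustrative assumptions, not part of any real system described in the paper.

```python
import threading
from multiprocessing import Process, Queue

def element(rank, inbox, outbox, n_threads=4):
    """One scale-out element: receives a chunk of work via message
    passing, processes it with local threads, and sends back a result.
    (Illustrative sketch; queues stand in for the interconnect.)"""
    chunk = inbox.get()                       # systems-level message passing
    partials = [0] * n_threads

    def worker(tid):                          # element-level multithreading
        partials[tid] = sum(chunk[tid::n_threads])

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    outbox.put((rank, sum(partials)))         # reply message to the "root"

if __name__ == "__main__":
    n_elements = 2
    data = list(range(100))                   # work to distribute
    inboxes = [Queue() for _ in range(n_elements)]
    results = Queue()
    procs = [Process(target=element, args=(r, inboxes[r], results))
             for r in range(n_elements)]
    for p in procs:
        p.start()
    for r in range(n_elements):               # scatter chunks to elements
        inboxes[r].put(data[r::n_elements])
    total = sum(results.get()[1] for _ in range(n_elements))
    for p in procs:
        p.join()
    print(total)
```

In a production setting the inter-process queues would be replaced by an MPI library and the thread pool by OpenMP or native threads, but the two-level structure, and the two distinct failure and tuning domains it creates, is the same.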