Models for the constraints under which an application should be scaled, including constant problem-size scaling, memory-constrained scaling, and time-constrained scaling, are reviewed. A realistic method is described that scales all relevant parameters under considerations imposed by the application domain. This method leads to different conclusions about the effectiveness and design of large multiprocessors than the naive practice of scaling only the data set size. The primary example application is a simulation of galaxies using the Barnes-Hut hierarchical N-body method.
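The three scaling models named above can be compared with a small sketch. It is illustrative only and rests on assumptions not stated in the abstract: sequential work grows as n log2 n for n bodies (a common cost estimate for Barnes-Hut), memory grows linearly in n, total memory grows linearly in the processor count p, and parallel efficiency is perfect. The function names are hypothetical. The sketch scales only the data set size, which is the naive practice the abstract contrasts with scaling all relevant parameters.

```python
import math


def work(n):
    """Assumed sequential work for n bodies: n * log2(n)."""
    return n * math.log2(n)


def constant_problem_size(n0, p):
    """Problem size stays fixed no matter how many processors are used."""
    return n0


def memory_constrained(n0, p):
    """Fill the available memory: assumes memory use is linear in n
    and total memory is linear in p."""
    return n0 * p


def time_constrained(n0, p):
    """Largest n whose ideal parallel time work(n) / p does not exceed
    the single-processor time work(n0). Assumes perfect speedup."""
    target = work(n0) * p
    lo, hi = n0, n0 * p  # work(n0) <= target <= work(n0 * p) for p >= 1
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if work(mid) <= target:
            lo = mid
        else:
            hi = mid - 1
    return lo


if __name__ == "__main__":
    n0 = 1024
    for p in (1, 4, 16, 64):
        print(p,
              constant_problem_size(n0, p),
              memory_constrained(n0, p),
              time_constrained(n0, p))
```

Under these assumptions the time-constrained size grows more slowly than the memory-constrained size, because work is superlinear in n. A faithful model would also have to scale the other application parameters the abstract refers to, which this sketch does not attempt.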