Shared-memory multiprocessors are frequently used in a time-sharing style, with multiple parallel applications executing at the same time. In such an environment, where the machine load varies continuously, the question arises of how an application should maximize its performance while being fair to other users of the system. In this paper, we address this issue. We first show that if the number of runnable processes belonging to a parallel application substantially exceeds the effective number of physical processors executing it, the application's performance can degrade significantly. We then propose a way of dynamically controlling the number of runnable processes associated with an application to ensure good performance. A centralized server determines the optimal number of runnable processes for each application, and applications dynamically suspend or resume processes to match that number. A preliminary implementation of the proposed scheme runs on the Encore Multimax, and we show how it improves the performance of several applications, in some cases by more than a factor of two. We also discuss the implications of the proposed scheme for multiprocessor schedulers, and how the scheme should interface with parallel programming languages.
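The control loop described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the server computes each application's target, so the policy below (equal shares of the processors, capped by what each application can use, with leftovers redistributed) is an assumption made only for illustration. The names `allocate_targets` and `adjustment` are hypothetical and do not come from the paper.

```python
def allocate_targets(num_processors, max_procs):
    """Hypothetical server policy (an assumption, not the paper's):
    split processors evenly among applications, never granting an
    application more runnable processes than it has (max_procs[app]),
    and redistribute whatever capped applications leave unused."""
    targets = {app: 0 for app in max_procs}
    remaining = num_processors
    active = [app for app in max_procs if max_procs[app] > 0]
    while remaining > 0 and active:
        share = max(1, remaining // len(active))
        still_hungry = []
        for app in active:
            if remaining == 0:
                break
            grant = min(share, max_procs[app] - targets[app], remaining)
            targets[app] += grant
            remaining -= grant
            if targets[app] < max_procs[app]:
                still_hungry.append(app)
        active = still_hungry
    return targets


def adjustment(runnable, target):
    """Application side: a positive result means resume that many
    suspended processes; a negative result means suspend that many."""
    return target - runnable


if __name__ == "__main__":
    # Two applications with 16 processes each on an 8-processor machine.
    targets = allocate_targets(8, {"a": 16, "b": 16})
    print(targets)
    # Application "a" currently has all 16 processes runnable.
    print(adjustment(16, targets["a"]))
```

In this sketch, an application with more runnable processes than its target receives a negative adjustment and would suspend that many processes. It would resume them when the load drops and its target rises.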