FFTs in external or hierarchical memory
The Journal of Supercomputing
The performance of multiprogrammed multiprocessor scheduling algorithms
SIGMETRICS '90 Proceedings of the 1990 ACM SIGMETRICS conference on Measurement and modeling of computer systems
The performance advantages of integrating block data transfer in cache-coherent multiprocessors
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
The SGI Origin: a ccNUMA highly scalable server
Proceedings of the 24th annual international symposium on Computer architecture
The MPEG-4 Book
IEEE Transactions on Parallel and Distributed Systems
Memory-Intensive Benchmarks: IRAM vs. Cache-Based Machines
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
A Two-step Genetic Algorithm for Mapping Task Graphs to a Network on Chip Architecture
DSD '03 Proceedings of the Euromicro Symposium on Digital Systems Design
Proceedings of the conference on Design, automation and test in Europe - Volume 1
Dynamic Critical Path Scheduling Parallel Programs onto Multiprocessors
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 8 - Volume 09
The PARSEC benchmark suite: characterization and architectural implications
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
An architectural co-synthesis algorithm for energy-aware Network-on-Chip design
Journal of Systems Architecture: the EUROMICRO Journal
Achieving predictable performance through better memory controller placement in many-core CMPs
Proceedings of the 36th annual international symposium on Computer architecture
Handling the problems and opportunities posed by multiple on-chip memory controllers
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Hi-index | 0.00 |
In this paper, we study and analyze process scheduling for multicore processors. It is expected that hundreds of cores will be integrated on a single chip, known as a Chip Multiprocessor (CMP). However, operating system process scheduling, one of the most important design issue for CMP systems, has not been well addressed. We define a model for future CMPs, based on which a minimal average accessing time scheduling algorithm is proposed to reduce on-chip communication latencies and improve performance. The impact of memory access and inter process communication (IPC) in scheduling are analyzed. We explore six typical core allocation strategies. Results show that, a strategy with the minimal average accessing time of both core-core and core-memory outperforms other strategies, the overall performance for three applications (FFT, LU and H.264) has improved for 8.23%, 4.81% and 10.21% respectively comparing with other strategies.