Dedicated processors that are specialized for numerical computations, such as the Cell/B.E. and vector processors, tend to perform poorly on the integer computations required by operating systems. To solve this problem, we propose a remote process and remote file I/O management architecture in which processes on compute nodes with dedicated processors are launched from a management node with general-purpose processors. The architecture allows processes and files to be managed as a single system. The management node provides general OS functions such as process management and file I/O, while the compute nodes are dedicated to executing numerical application programs. This division takes advantage of the characteristics of each processor and achieves efficient execution of both OS functions and applications. In this architecture, our heterogeneity-aware binary loader allows programs to be executed on compute nodes with different types of processors, while our remote file I/O function transparently executes, at the management node, the file I/O issued by programs running on the compute nodes. The proposed architecture has been integrated into the Linux kernel. The system was evaluated on a cluster of one x86_64 node and 16 Cell/B.E. nodes. The results showed that, compared with using the compute nodes alone, process invocation is 41 times faster than with rsh, and the start-up time of an MPI program is 1.6 times faster. Remote file I/O is twice as fast as NFS, and a 30% reduction in execution time was confirmed for the NAS Parallel Benchmark BTIO.
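The abstract does not say how the heterogeneity-aware binary loader decides where a program should run. As a rough illustration only, one plausible approach is to read the `e_machine` field of the program's ELF header and route the binary to a node of the matching processor type. The sketch below is a user-space Python approximation, not the authors' in-kernel implementation. The `NODE_CLASS` table and the `choose_node` and `fake_header` helpers are hypothetical names introduced here. The `e_machine` constants and header offsets come from the ELF specification.

```python
import struct

# e_machine values defined by the ELF specification.
EM_PPC = 20      # 32-bit PowerPC
EM_PPC64 = 21    # 64-bit PowerPC (the Cell/B.E. PPE is a 64-bit PowerPC core)
EM_X86_64 = 62   # AMD x86-64

# Hypothetical routing table: which kind of node runs which machine type.
NODE_CLASS = {
    EM_X86_64: "management",
    EM_PPC: "compute",
    EM_PPC64: "compute",
}


def elf_machine(header: bytes) -> int:
    """Return the e_machine field from the first 20 bytes of an ELF image."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    ei_data = header[5]  # e_ident[EI_DATA]: 1 = little-endian, 2 = big-endian
    if ei_data == 1:
        fmt = "<H"
    elif ei_data == 2:
        fmt = ">H"
    else:
        raise ValueError("unknown ELF data encoding")
    # e_machine is a 2-byte field at offset 18, after e_ident (16) and e_type (2).
    return struct.unpack_from(fmt, header, 18)[0]


def choose_node(header: bytes) -> str:
    """Pick a node class for a binary (assumed policy, for illustration)."""
    machine = elf_machine(header)
    if machine not in NODE_CLASS:
        raise ValueError(f"unsupported e_machine {machine}")
    return NODE_CLASS[machine]


def fake_header(ei_data: int, machine: int) -> bytes:
    """Build a minimal 20-byte ELF header prefix for demonstration."""
    # e_ident: magic, EI_CLASS=2 (64-bit), EI_DATA, EI_VERSION=1, then padding.
    ident = b"\x7fELF" + bytes([2, ei_data, 1]) + bytes(9)
    order = "<" if ei_data == 1 else ">"
    # e_type = 2 (ET_EXEC), followed by e_machine.
    return ident + struct.pack(order + "HH", 2, machine)


if __name__ == "__main__":
    print(choose_node(fake_header(1, EM_X86_64)))
    print(choose_node(fake_header(2, EM_PPC64)))
```

In a real system the header bytes would come from the first 20 bytes of the executable file, and the result would select a remote compute node rather than return a label.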